Training freezes during fit_generator()

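I am trying to train a model on a dataset of 6000 training images and 1000 validation images, but training simply freezes partway through an epoch, without any error message.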

1970/6000 [========>.....................] - ETA: 1:50:11 - loss: 1.2256 - accuracy: 0.5956
1971/6000 [========>.....................] - ETA: 1:50:08 - loss: 1.2252 - accuracy: 0.5958
1972/6000 [========>.....................] - ETA: 1:50:08 - loss: 1.2248 - accuracy: 0.5960
1973/6000 [========>.....................] - ETA: 1:50:06 - loss: 1.2245 - accuracy: 0.5962
1974/6000 [========>.....................] - ETA: 1:50:04 - loss: 1.2241 - accuracy: 0.5964
1975/6000 [========>.....................] - ETA: 1:50:02 - loss: 1.2243 - accuracy: 0.5961
1976/6000 [========>.....................] - ETA: 1:50:00 - loss: 1.2239 - accuracy: 0.5963
1977/6000 [========>.....................] - ETA: 1:49:58 - loss: 1.2236 - accuracy: 0.5965
1978/6000 [========>.....................] - ETA: 1:49:57 - loss: 1.2241 - accuracy: 0.5962
1979/6000 [========>.....................] - ETA: 1:49:56 - loss: 1.2237 - accuracy: 0.5964
1980/6000 [========>.....................] - ETA: 1:49:55 - loss: 1.2242 - accuracy: 0.5961
1981/6000 [========>.....................] - ETA: 1:49:53 - loss: 1.2252 - accuracy: 0.5958
1982/6000 [========>.....................] - ETA: 1:49:52 - loss: 1.2257 - accuracy: 0.5955

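I waited 5-6 minutes, but nothing seemed to happen. I tried the following:
1. Changing steps_per_epoch to 100 and increasing the number of epochs to 20.
2. Suspecting the ReduceLROnPlateau callback, I added cooldown=1.
Neither of these two changes solved the problem.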
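One way to narrow down a stall like this is to check whether the data generator itself is slow to produce a batch. The sketch below is only an illustration, not something from the original post: `timed_batches`, `slow_threshold_s`, `clock` and `report` are hypothetical names, and the wrapper can only report a batch after it has finally been produced, so it separates slow batches from fast ones but stays silent on a batch that never returns.

```python
import time


def timed_batches(generator, slow_threshold_s=5.0,
                  clock=time.perf_counter, report=print):
    """Yield every batch from `generator`, reporting batches that were slow to produce.

    All names here are hypothetical helpers for illustration.
    """
    iterator = iter(generator)
    index = 0
    while True:
        start = clock()
        try:
            batch = next(iterator)
        except StopIteration:
            # The wrapped generator is exhausted; end this generator too.
            return
        elapsed = clock() - start
        if elapsed > slow_threshold_s:
            report(f"batch {index} took {elapsed:.1f}s")
        yield batch
        index += 1
```

Assuming a single worker without multiprocessing, as in the question, the wrapped generator could be passed in place of `train`, for example `model.fit_generator(timed_batches(train), ...)`.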

My setup: i5-8300H, GTX 1060 6GB, Keras 2.0 built with GPU support

My code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
import tensorflow as tf
from skimage import exposure, color
from keras.optimizers import Adam
from tqdm import tqdm
from keras.models import Model
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Convolution2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint, Callback
from keras import regularizers
from keras.applications.densenet import DenseNet121
from keras_preprocessing.image import ImageDataGenerator
from sklearn.utils import class_weight
from collections import Counter

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

# Histogram equalization
def HE(img):
    img_eq = exposure.equalize_hist(img)
    return img_eq

def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20, 20))
    axes = axes.flatten()
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=40,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    preprocessing_function=HE,
)

validation_datagen = ImageDataGenerator(
    rescale=1. / 255
)

test_datagen = ImageDataGenerator(
    rescale=1. / 255
)

#get image and label with augmentation
train = train_datagen.flow_from_directory(
    'train/train_deep/',
    target_size=(224, 224),
    class_mode='categorical',
    shuffle=False,
    batch_size=20,
)

test = test_datagen.flow_from_directory(
    'test_deep/',
    batch_size=1,
    target_size=(224, 224),
)

val = validation_datagen.flow_from_directory(
    'train/validate_deep/',
    target_size=(224, 224),
    batch_size=20,
)

#Training
X_train, y_train = next(train)
class_names = ['No DR', 'Mild', 'Moderate', 'Severe', 'Proliferative DR']
counter = Counter(train.classes)
class_weights = class_weight.compute_class_weight(
    'balanced',
    np.unique(train.classes),
    train.classes)
#X_test , y_test = next(test)
#X_test=np.reshape(X_test,(X_test.shape[0],X_test.shape[1],X_test.shape[2]))

#Training parameter
batch_size = 32
Epoch = 2

model = DenseNet121(include_top=True, weights=None, input_tensor=None,
                    input_shape=(224, 224, 3), pooling=None, classes=5)
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.01),
              metrics=['accuracy'])
model.summary()

filepath = "weights-improvement-{epoch:02d}-{val_loss:.2f}.hdf5"
checkpointer = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                               save_best_only=True, save_weights_only=True)
lr_reduction = ReduceLROnPlateau(monitor='val_loss', patience=5, verbose=2,
                                 factor=0.2, cooldown=1)
callbacks_list = [checkpointer, lr_reduction]

#Validation
X_val, y_val = next(val)

#history = model.fit(X_train,y_train,epochs=Epoch,validation_data = (X_val,y_val))
history = model.fit_generator(
    train,
    epochs=Epoch,
    steps_per_epoch=6000,
    class_weight=class_weights,
    validation_data=val,
    validation_steps=1000,
    use_multiprocessing=False,
    max_queue_size=100,
    workers=1,
    callbacks=callbacks_list
)

# Score trained model.
scores = model.evaluate(X_val, y_val, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

#predict
test.reset()
pred = model.predict_generator(test, steps=25,)
print(pred)
for i in pred:
    print(np.argmax(i))
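A side note on the class_weight argument in the code above: Keras documents it as a dictionary mapping class indices to weights, while compute_class_weight returns an array. Whether this has anything to do with the freeze is not established in this thread. The sketch below shows the "balanced" weighting, n_samples / (n_classes * count), built directly as a dictionary; `balanced_class_weights` is a hypothetical helper and the labels are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_index: weight} with weight = n_samples / (n_classes * count).

    Hypothetical helper for illustration only.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Made-up labels: class 0 is over-represented, so it gets the smallest weight.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
weights = balanced_class_weights(labels)
print(weights)
```

With the array from the original code, `dict(enumerate(class_weights))` would produce a dictionary of the same shape, assuming the classes are numbered 0 to 4 in order.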

python keras keras-layer tf.keras keras-2
1 Answer
If you were using Keras < 2.0.0 (which I do not recommend, since it is an old version), this code would work fine.
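One plausible reading of this answer, which the answer itself does not spell out: in Keras 1.x, fit_generator took samples_per_epoch, counted in samples, whereas Keras 2 uses steps_per_epoch, counted in batches. With batch_size=20, steps_per_epoch=6000 asks for 120,000 samples per epoch rather than 6000. The sketch below computes the step counts that cover each dataset once, under that assumption; `steps_for` is a hypothetical helper, and nothing in the thread confirms that this resolves the freeze.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


# Figures from the question: 6000 training and 1000 validation images, batch size 20.
train_steps = steps_for(6000, 20)
val_steps = steps_for(1000, 20)
print(train_steps, val_steps)
```

These values would then be passed as steps_per_epoch and validation_steps. The iterators returned by flow_from_directory also expose `samples` and `batch_size` attributes, which could replace the hard-coded numbers.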