How can I use a generator to run my Neural Network model?
I have a neural network model that runs perfectly, but my dataset is very large, so I tried to feed the model from a generator. That gives me the following error:
"UnimplementedError:
File "<ipython-input-63-352f4097b60f>", line 146, in <module>
validation_steps = len(df_valid)/batch_size, shuffle=True)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1297, in fit_generator
steps_name='steps_per_epoch')
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 265, in model_iteration
batch_outs = batch_function(*batch_data)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 973, in train_on_batch
class_weight=class_weight, reset_metrics=reset_metrics)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 264, in train_on_batch
output_loss_metrics=model._output_loss_metrics)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 311, in train_on_batch
output_loss_metrics=output_loss_metrics))
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 252, in _process_single_batch
training=training))
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 166, in _model_loss
per_sample_losses = loss_fn.call(targets[i], outs[i])
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\losses.py", line 221, in call
return self.fn(y_true, y_pred, **self._fn_kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\losses.py", line 978, in sparse_categorical_crossentropy
y_true, y_pred, from_logits=from_logits, axis=axis)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\backend.py", line 4530, in sparse_categorical_crossentropy
target = cast(target, 'int64')
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\keras\backend.py", line 1571, in cast
return math_ops.cast(x, dtype)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 704, in cast
x = gen_math_ops.cast(x, base_type, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 2211, in cast
_six.raise_from(_core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
UnimplementedError: Cast string to int64 is not supported [Op:Cast] name: loss/dense_38_loss/Cast/
What is wrong with the generator?
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence
from keras.utils import np_utils
import pandas as pd
batch_size = 64
# Read in the training and validation data
df_train = pd.read_csv(r'C:\train.txt', sep='|', encoding='latin-1',
                       low_memory=False)
df_valid = pd.read_csv(r'C:\valid.txt', sep='|', encoding='latin-1',
                       low_memory=False)
print('training rows:', len(df_train))
print('validation rows:', len(df_valid))
def generator(df, vocab_size, batch_size, tokenizer, input_encoder, onehot):
    n_examples = len(df)
    number_of_batches = n_examples / batch_size
    counter = 0
    while 1:
        start_index = counter * batch_size
        end_index = start_index + batch_size
        X_out1 = np.array(([batch_size, n_examples, vocab_size]), dtype=int)
        if counter > number_of_batches + 1:
            # reshuffle dataframe and start over
            df.sample(frac=1).reset_index(drop=True)
            counter = 0
        counter += 1
        X_out1 = tokenizer.texts_to_sequences(df.iloc[start_index: end_index]['var1'])
        X_out1 = sequence.pad_sequences(X_out1, maxlen=200)
        X_out2 = df.iloc[start_index: end_index][['var2', 'var3']]
        X_out2 = input_encoder.transform(df.iloc[start_index: end_index][['var2', 'var3']])
        X_out2 = onehot.transform(df.iloc[start_index: end_index][['var2', 'var3']])
        Y_out = df.iloc[start_index: end_index]['code']
        yield [X_out1, X_out2], [Y_out]
tokenizer = Tokenizer()
tokenizer.fit_on_texts(df_train['var1'])
input_encoder = MultiColumnLabelEncoder()
train_X2=df_train[['var2','var3']]
valid_X2 =df_valid[['var2','var3']]
input_encoder.fit(train_X2)
onehot = OneHotEncoder(sparse=False,categories='auto')
onehot.fit(train_X2)
code_type = 'code'
train_labels = df_train[code_type]
valid_labels = df_valid[code_type]
label_encoder = LabelEncoder()
labels = set(df_train[code_type].tolist() + df_valid[code_type].tolist())
label_encoder.fit(list(labels))
n_classes = len(set(labels))
print('n_classes = %s' % n_classes)
input_text = Input(shape=(200,), dtype='int32', name='input_text')
meta_input = Input(shape=(2,), name='meta_input')
embedding = Embedding(input_dim=len(tokenizer.word_index) + 1,
                      output_dim=300,
                      input_length=200)(input_text)
lstm = Bidirectional(LSTM(units=128,
                          dropout=0.5,
                          recurrent_dropout=0.5,
                          return_sequences=True),
                     merge_mode='concat')(embedding)
pool = GlobalMaxPooling1D()(lstm)
dropout = Dropout(0.5)(pool)
text_output = Dense(n_codes, activation='sigmoid', name='aux_output')(dropout)
output = concatenate([text_output, meta_input])
output = Dense(n_codes, activation='relu')(output)
main_output = Dense(n_codes, activation='softmax', name='main_output')(output)
model = Model(inputs=[input_text, meta_input], outputs=[output])
optimer = Adam(lr=.001)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Generators
train_generator = generator(df_train,vocab_size,batch_size, tokenizer,input_encoder,onehot)
validation_generator = generator(df_valid,vocab_size,batch_size, tokenizer,input_encoder,onehot)
model.summary()
model.fit_generator(generator=train_generator,
                    validation_data=validation_generator,
                    epochs=20, steps_per_epoch=len(df_train)/batch_size,
                    validation_steps=len(df_valid)/batch_size, shuffle=True)
From what I can see, the problem is a type mismatch caused by reading the data from a CSV file. The labels are read from the CSV as strings and are not converted to integers automatically.
These are the expected outputs (labels) that you read from the CSV:
Y_out = df.iloc[start_index: end_index]['code']
It is probably a string column; try printing Y_out.dtypes to confirm.
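A minimal sketch of that check, assuming a pipe-separated file whose label column holds non-numeric codes. The sample data and the values "A10" and "B20" are made up for illustration; only the column name `code` and the `sep='|'` setting come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for train.txt: pipe-separated, with a label column
# whose values are not purely numeric.
raw = "var1|code\nsome text|A10\nmore text|B20\nother text|A10\n"
df = pd.read_csv(io.StringIO(raw), sep="|")

# pandas keeps such a column as text, so Keras would receive strings.
print(df["code"].dtype)

# A dtype check that does not depend on the exact string dtype in use.
labels_are_ints = pd.api.types.is_integer_dtype(df["code"])
print("integer labels:", labels_are_ints)
```

If the check reports that the labels are not integers, the generator is yielding text where the loss function expects class indices.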
The loss function in your traceback, sparse_categorical_crossentropy, expects integer class indices and casts the targets to int64. The type mismatch surfaces at that cast, because TensorFlow cannot cast a string to int64.
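One possible fix is to convert the labels to integer class indices inside the generator, reusing the LabelEncoder the question already fits. The sketch below assumes that approach; `encode_labels` is a hypothetical helper, and the small DataFrames stand in for `df_train` and `df_valid`.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data standing in for df_train / df_valid from the question.
df_train = pd.DataFrame({"code": ["A10", "B20", "A10", "C30"]})
df_valid = pd.DataFrame({"code": ["B20", "C30"]})

# Fit on the union of training and validation labels, as the question does.
labels = set(df_train["code"].tolist() + df_valid["code"].tolist())
label_encoder = LabelEncoder()
label_encoder.fit(sorted(labels))


def encode_labels(df, start_index, end_index, encoder):
    """Return one batch of labels as an int64 array of class indices."""
    batch = df.iloc[start_index:end_index]["code"].tolist()
    return encoder.transform(batch).astype(np.int64)


# Inside the generator this would replace the raw column slice for Y_out.
Y_out = encode_labels(df_train, 0, 3, label_encoder)
print(Y_out, Y_out.dtype)
```

In the generator, `label_encoder` would need to be passed in as an extra argument so that `Y_out` is encoded before the `yield`.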