Neural network can't seem to learn a simple relationship TensorFlow
I'm experimenting with TensorFlow (which seems amazing so far!) and I'm playing around with a toy example: a 1-class classification problem. I'm generating some features, and if the first feature is above a threshold then the example is "positive".
Full code here: https://gist.github.com/tnbredillet/f136c2bc40815517e0aa1139bd2060ee
The problem is that the model seems unable to capture that simple relationship. Of course I'm missing a lot of stuff (CV, regularization, batch normalization, hyperparameter tuning, to name a few). But still, I would expect the model to manage to figure that one out, right? Maybe there's simply a bug in my code?
Would welcome any insights :-)
EDIT:
Data generating code:
num_examples = 100000
split = 0.2
num_features = 1

def generate_input_data(num_examples, num_features):
    features = []
    labels = []
    for i in xrange(num_examples):
        features.append(np.random.rand(num_features) * np.random.randint(1, 10) + np.random.rand(num_features))
        if np.random.randint(101) > 90:
            features[i-1][np.random.randint(num_features)] = 0
        hard = ceil(np.sum(features[i-1])) % 2
        easy = 0
        if features[i-1][0] > 3:
            easy = 1
        labels.append(easy)
    df = pd.concat(
        [
            pd.DataFrame(features),
            pd.Series(labels).rename('labels')
        ],
        axis=1,
    )
    return df

def one_hot_encoding(train_df):
    #TODO: handle categorical feature one hot encoding.
    return 0, 0

def scale_data(train_df, test_df):
    categorical_columns, encoding = one_hot_encoding(train_df)
    scaler = MinMaxScaler(feature_range=(0,1))
    scaler.fit(train_df.drop(['labels'], axis=1))
    train_df = pd.concat(
        [
            pd.DataFrame(scaler.transform(train_df.drop('labels', axis=1))),
            train_df['labels']
        ],
        axis=1,
    )
    test_df = pd.concat(
        [
            pd.DataFrame(scaler.transform(test_df.drop('labels', axis=1))),
            test_df['labels']
        ],
        axis=1,
    )
    return train_df, test_df

def preprocess_data(train_df, test_df):
    all_dfs = [train_df, test_df]
    features = set()
    for df in all_dfs:
        features |= set(df.columns)
    for df in all_dfs:
        for f in features:
            if f not in df.columns:
                df[f] = 0.0
    for df in all_dfs:
        df.sort_index(axis=1, inplace=True)
    train_df, test_df = scale_data(train_df, test_df)
    train_df = shuffle(train_df).reset_index(drop=True)
    return train_df, test_df

def get_data(num_examples, split):
    train_df = generate_input_data(num_examples, num_features)
    test_df = generate_input_data(int(ceil(num_examples*split)), num_features)
    return preprocess_data(train_df, test_df)

def get_batch(df, batch_size, epoch):
    start = batch_size*epoch-batch_size
    end = batch_size*epoch
    if end > len(df):
        end = len(df)
    size = end - start
    batch_x = df.drop('labels', axis=1)[start:end].as_matrix()
    batch_y = df['labels'][start:end].as_matrix().reshape(size, 1)
    return batch_x, batch_y
And the network definition, training, and evaluation:
train_df, test_df = get_data(num_examples, split)
n_hidden_1 = 8
n_hidden_2 = 4
learning_rate = 0.01
batch_size = 500
num_epochs = 200
display_epoch = 50

def neural_net(x):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

weights = {
    'h1': tf.Variable(tf.random_normal([num_features, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, 1]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([1]))
}
X = tf.placeholder(tf.float32, shape=(None, num_features))
Y = tf.placeholder(tf.float32, shape=(None, 1))
logits = neural_net(X)
loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)
predictions = tf.sigmoid(logits)
predicted_class = tf.greater(predictions, 0.5)
correct = tf.equal(predicted_class, tf.equal(Y,1.0))
accuracy = tf.reduce_mean( tf.cast(correct, 'float') )

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    for epoch in range(1, num_epochs + 1):
        batch_x, batch_y = get_batch(train_df, batch_size, epoch)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if epoch % display_epoch == 0 or epoch == 1:
            loss, acc, pred, fff = sess.run([loss_op, accuracy, predictions, logits],
                                            feed_dict={X: batch_x,
                                                       Y: batch_y})
            c = ', '.join('{}={}'.format(*t) for t in zip(pred, batch_y))
            print("[{}] Batch loss={:.4f}, Accuracy={:.5f}, Logits vs labels= {}".format(epoch, loss, acc, c))
    print("Optimization Finished!")
    batch_x, batch_y = get_batch(test_df, batch_size, 1)
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={X: batch_x,
                                      Y: batch_y}))
Final output:
[1] Batch loss=3.2160, Accuracy=0.41000
[50] Batch loss=0.6661, Accuracy=0.61800
[100] Batch loss=0.6472, Accuracy=0.65200
[150] Batch loss=0.6538, Accuracy=0.64000
[200] Batch loss=0.6508, Accuracy=0.64400
Optimization Finished!
('Testing Accuracy:', 0.63999999)
In this case it is not a machine learning algorithm problem, but a bug in your data generation which is scrambling the relationship that you intend. In this function:
def generate_input_data(num_examples, num_features):
    features = []
    labels = []
    for i in xrange(num_examples):
        features.append(np.random.rand(num_features) * np.random.randint(1, 10) + np.random.rand(num_features))
        if np.random.randint(101) > 90:
            features[i-1][np.random.randint(num_features)] = 0
        hard = ceil(np.sum(features[i-1])) % 2
        easy = 0
        if features[i-1][0] > 3:
            easy = 1
        labels.append(easy)
    df = pd.concat(
        [
            pd.DataFrame(features),
            pd.Series(labels).rename('labels')
        ],
        axis=1,
    )
    return df
You are indexing features by i-1 to determine the label. However, xrange generates numbers starting from 0, and features.append has already placed the current example at index i, so you don't need to subtract the 1. With features[i-1], every label after the first is computed from the previous example's feature rather than its own. The relationship between a row's feature and its label then becomes close to random, and essentially unpredictable, so even though the rest of your model is OK, it won't be able to score well.
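The effect of the off-by-one can be checked without TensorFlow. The following is a minimal sketch, not the code from the question: it uses only Python 3's standard library `random` in place of NumPy, leaves out the zeroing step, and the names `make_rows`, `agreement`, and `offset` are hypothetical helpers introduced for illustration. It measures how often a row's label matches the threshold rule applied to that same row's feature.

```python
import random

def make_rows(num_examples, offset, seed=0):
    """Build features and labels for a single-feature dataset.

    `offset` is subtracted from the loop index when picking the row the label
    is computed from: 1 mimics the i-1 indexing, 0 mimics indexing by i.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for i in range(num_examples):
        # Same shape as the question's generator: uniform * randint(1..9) + uniform.
        features.append(rng.random() * rng.randint(1, 9) + rng.random())
        labels.append(1 if features[i - offset] > 3 else 0)
    return features, labels

def agreement(features, labels):
    """Fraction of rows whose label matches the rule applied to their own feature."""
    hits = sum(int(f > 3) == y for f, y in zip(features, labels))
    return hits / len(features)

buggy = agreement(*make_rows(20000, offset=1))
fixed = agreement(*make_rows(20000, offset=0))
print("label matches own feature, i-1 indexing:", round(buggy, 3))
print("label matches own feature, i indexing:  ", round(fixed, 3))
```

With indexing by i the agreement is exact by construction. With i-1 it should land near chance level, because each label then depends on an independently drawn neighbouring row.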
So you need to index by i instead, e.g. if features[i][0] > 3. The same applies to the other features[i-1] references in the loop.
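One way to rule out this class of bug is to avoid index arithmetic altogether: build each row in a local variable, derive the label from it, and only then append both. The sketch below illustrates that pattern under stated assumptions: it is written for Python 3 with the standard library `random` instead of NumPy and pandas, it returns plain lists rather than a DataFrame, and `generate_rows` and `threshold` are hypothetical names that do not appear in the original code.

```python
import random

def generate_rows(num_examples, num_features, threshold=3.0, seed=None):
    """Return (features, labels); each label is derived from the row being built."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(num_examples):
        scale = rng.randint(1, 9)  # plays the role of np.random.randint(1, 10)
        row = [rng.random() * scale + rng.random() for _ in range(num_features)]
        # Occasionally zero one entry of the current row, as the original does
        # (random.randint is inclusive, so 0..100 matches np.random.randint(101)).
        if rng.randint(0, 100) > 90:
            row[rng.randrange(num_features)] = 0.0
        # Label from the current row, so feature and label cannot drift apart.
        labels.append(1 if row[0] > threshold else 0)
        features.append(row)
    return features, labels

features, labels = generate_rows(1000, num_features=1, seed=0)
mismatches = sum((row[0] > 3.0) != bool(y) for row, y in zip(features, labels))
print("rows whose label disagrees with the rule:", mismatches)
```

A check like the `mismatches` count is cheap to run on any generated dataset before training, and it would have flagged the original problem immediately.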