
Neural network can't seem to learn a simple relationship TensorFlow

I'm experimenting with TensorFlow (which seems amazing so far!) and I'm playing around with a toy example: a one-class classification problem. I'm generating some features, and if the first feature is above a threshold then the example is "positive".

Full code here: https://gist.github.com/tnbredillet/f136c2bc40815517e0aa1139bd2060ee

The problem is that the model seems unable to capture that simple relationship. Of course I'm missing a lot of stuff (cross-validation, regularization, batch normalization, hyperparameter tuning), to name a few. But still, I would expect the model to manage to figure that one out, right? Maybe there's simply a bug in my code?

Would welcome any insights :-)

EDIT:

Data generating code:

num_examples = 100000
split = 0.2
num_features = 1


def generate_input_data(num_examples, num_features):
    features = []
    labels = []
    for i in xrange(num_examples):
        features.append(np.random.rand(num_features) * np.random.randint(1, 10) + np.random.rand(num_features))
        if np.random.randint(101) > 90:
            features[i-1][np.random.randint(num_features)] = 0

        hard = ceil(np.sum(features[i-1])) % 2
        easy = 0
        if features[i-1][0] > 3:
            easy = 1
        labels.append(easy)

    df = pd.concat(
        [
            pd.DataFrame(features),
            pd.Series(labels).rename('labels')
        ],
        axis=1,
    )
    return df


def one_hot_encoding(train_df):
    #TODO: handle categorical feature one hot encoding.
    return 0, 0


def scale_data(train_df, test_df):
    categorical_columns, encoding = one_hot_encoding(train_df)

    scaler = MinMaxScaler(feature_range=(0,1))

    scaler.fit(train_df.drop(['labels'], axis=1))

    train_df = pd.concat(
        [
            pd.DataFrame(scaler.transform(train_df.drop('labels', axis=1))),
            train_df['labels']
        ],
        axis=1,
    )
    test_df = pd.concat(
        [
            pd.DataFrame(scaler.transform(test_df.drop('labels', axis=1))),
            test_df['labels']
        ],
        axis=1,
    )

    return train_df, test_df


def preprocess_data(train_df, test_df):
    all_dfs = [train_df, test_df]
    features = set()
    for df in all_dfs:
        features |= set(df.columns)

    for df in all_dfs:
        for f in features:
            if f not in df.columns:
                df[f] = 0.0

    for df in all_dfs:
        df.sort_index(axis=1, inplace=True)

    train_df, test_df = scale_data(train_df, test_df)


    train_df = shuffle(train_df).reset_index(drop=True)

    return train_df, test_df


def get_data(num_examples, split):
    train_df = generate_input_data(num_examples, num_features)
    test_df = generate_input_data(int(ceil(num_examples*split)), num_features)
    return preprocess_data(train_df, test_df)

def get_batch(df, batch_size, epoch):
    start = batch_size*epoch-batch_size
    end = batch_size*epoch
    if end > len(df):
        end = len(df)
    size = end - start       
    batch_x = df.drop('labels', axis=1)[start:end].as_matrix()
    batch_y = df['labels'][start:end].as_matrix().reshape(size, 1)
    return batch_x, batch_y

And the network definition/training and evaluation:

train_df, test_df = get_data(num_examples, split)

n_hidden_1 = 8
n_hidden_2 = 4
learning_rate = 0.01
batch_size = 500
num_epochs = 200
display_epoch = 50

def neural_net(x):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

weights = {
    'h1': tf.Variable(tf.random_normal([num_features, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, 1]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([1]))
}

X = tf.placeholder(tf.float32, shape=(None, num_features))
Y = tf.placeholder(tf.float32, shape=(None, 1))    

logits = neural_net(X)

loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

predictions = tf.sigmoid(logits)
predicted_class = tf.greater(predictions, 0.5)
correct = tf.equal(predicted_class, tf.equal(Y,1.0))
accuracy = tf.reduce_mean( tf.cast(correct, 'float') )

with tf.Session() as sess:

    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    for epoch in range(1, num_epochs + 1):
        batch_x, batch_y = get_batch(train_df, batch_size, epoch)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if epoch % display_epoch == 0 or epoch == 1:
            loss, acc, pred, fff = sess.run([loss_op, accuracy, predictions, logits],
                                            feed_dict={X: batch_x,
                                                       Y: batch_y})
            c = ', '.join('{}={}'.format(*t) for t in zip(pred, batch_y))
            print("[{}] Batch loss={:.4f}, Accuracy={:.5f}, Logits vs labels= {}".format(epoch, loss, acc, c))


    print("Optimization Finished!")

    batch_x, batch_y = get_batch(test_df, batch_size, 1)
    print("Testing Accuracy:", \
    sess.run(accuracy, feed_dict={X: batch_x,
                                  Y: batch_y}))

Final output:

[1] Batch loss=3.2160, Accuracy=0.41000
[50] Batch loss=0.6661, Accuracy=0.61800
[100] Batch loss=0.6472, Accuracy=0.65200
[150] Batch loss=0.6538, Accuracy=0.64000
[200] Batch loss=0.6508, Accuracy=0.64400
Optimization Finished!
('Testing Accuracy:', 0.63999999)

In this case it is not a machine learning algorithm problem, but a bug in your data generation that is scrambling the relationship you intend. In this function:

def generate_input_data(num_examples, num_features):
    features = []
    labels = []
    for i in xrange(num_examples):
        features.append(np.random.rand(num_features) * np.random.randint(1, 10) + np.random.rand(num_features))
        if np.random.randint(101) > 90:
            features[i-1][np.random.randint(num_features)] = 0

        hard = ceil(np.sum(features[i-1])) % 2
        easy = 0
        if features[i-1][0] > 3:
            easy = 1
        labels.append(easy)

    df = pd.concat(
        [
            pd.DataFrame(features),
            pd.Series(labels).rename('labels')
        ],
        axis=1,
    )
    return df

You are indexing features by i-1 to determine the label. However, xrange generates numbers starting from 0, so you don't need to subtract 1. In fact, when you do, the relationship becomes close to random and essentially unpredictable, so even though the rest of your model is OK, it won't be able to score well.
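You can see this empirically with a minimal sketch (using only NumPy; it approximates the off-by-one with np.roll and ignores the occasional zeroed feature). Since each row's features carry essentially no information about the previous row's threshold outcome, the best the network can do is roughly predict the majority class, which is consistent with the ~0.64 plateau above.

    import numpy as np

    np.random.seed(0)
    n = 100000
    # first feature of each row, drawn the same way as in generate_input_data
    first = np.random.rand(n) * np.random.randint(1, 10, size=n) + np.random.rand(n)

    shifted_labels = (np.roll(first, 1) > 3).astype(int)  # label i derived from row i-1
    aligned_labels = (first > 3).astype(int)              # label i derived from row i

    rule = (first > 3).astype(int)  # the relationship the model is asked to learn
    print((rule == shifted_labels).mean())  # well below 1.0 -- near chance agreement
    print((rule == aligned_labels).mean())  # 1.0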

So you need to index by i instead, e.g. if features[i][0] > 3.
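For reference, here is what the generator looks like with the indexing fixed (a sketch assuming the same numpy/pandas imports as the gist; the unused hard label is dropped for brevity):

    def generate_input_data(num_examples, num_features):
        features = []
        labels = []
        for i in xrange(num_examples):
            features.append(np.random.rand(num_features) * np.random.randint(1, 10) + np.random.rand(num_features))
            if np.random.randint(101) > 90:
                # occasionally zero out a random feature, as in the original
                features[i][np.random.randint(num_features)] = 0

            # the label now depends on the row just generated, not the previous one
            labels.append(1 if features[i][0] > 3 else 0)

        return pd.concat(
            [pd.DataFrame(features), pd.Series(labels).rename('labels')],
            axis=1,
        )

With aligned labels the threshold rule is linearly separable, so even this simple network should have no trouble learning it.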
