How to convert this PyTorch loss function to TensorFlow?
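The code for a paper I am reading has a loss function written in PyTorch. I tried to convert it as faithfully as I could, but my model predicts all zeros, so I would like to ask about the following:
This is the function: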
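# PyTorch
import torch
import torch.nn as nn

class AdjMSELoss1(nn.Module):
    def __init__(self):
        super(AdjMSELoss1, self).__init__()

    def forward(self, outputs, labels):
        outputs = torch.squeeze(outputs)
        alpha = 2
        loss = (outputs - labels)**2
        # Sign of the product tells whether prediction and label agree in direction
        adj = torch.mul(outputs, labels)
        adj[adj>0] = 1 / alpha   # same sign: scale the error down
        adj[adj<0] = alpha       # opposite sign: scale the error up
        loss = loss * adj
        return torch.mean(loss)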
# TensorFlow
def custom_loss_function(outputs,labels):
    outputs = tf.squeeze(outputs)
    alpha = 2.0
    loss = (outputs - labels) ** 2.0
    adj = tf.math.multiply(outputs,labels)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)
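The function compiles without errors and is passed as both the loss and the metrics argument. The value it reports in the metrics log looks plausible (similar to val_loss), but after training the model predicts nothing but zeros.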
model.compile(
    loss=custom_loss_function,
    optimizer=optimization,
    metrics=[custom_loss_function]
)
Model:
# Simplified for readability
def build_model():  # wrapper name assumed; the original snippet omitted the enclosing function
    model = Sequential()
    model.add(LSTM(32, input_shape=(SEQ_LEN, feature_number), return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(96, return_sequences=False))
    model.add(Dropout(0.3))
    model.add(Dense(1))
    return model
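The inputs/features are the pct_change of the price over the previous SEQ_LEN days (given SEQ_LEN days, the model tries to predict the next day: the target).
The output/target is the next day's price pct_change * 100 (for example, 5% is represented as 5), one value per row.
Note: with RMSE() as the loss function the model predicts normally; with custom_loss_function, as mentioned above, it only predicts zeros.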
Try this custom_loss:
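# Keras calls a loss as fn(y_true, y_pred). This loss is symmetric in its
# two arguments, so the argument order does not change the result.
def custom_loss(y_true, y_pred):
    alpha = 2.0
    loss = (y_pred - y_true) ** 2.0
    adj = tf.math.multiply(y_pred, y_true)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)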
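A possible explanation for why leaving out tf.squeeze helps, offered as an assumption because the question does not show the tensor shapes: if Keras passes both arguments with shape (batch, 1), squeezing only one of them makes the subtraction and the product broadcast to (batch, batch), so every prediction is compared with every label in the batch. The NumPy sketch below illustrates the shape change (TensorFlow follows the same broadcasting rules); the array values are made up for illustration.

```python
import numpy as np

batch = 4
# Assumed shapes: both tensors arrive as (batch, 1)
y_true = np.arange(batch, dtype=np.float32).reshape(batch, 1)
y_pred = np.ones((batch, 1), dtype=np.float32)

# Squeezing only one operand, as the version in the question does
squeezed = np.squeeze(y_true)           # shape (4,)
pairwise = (squeezed - y_pred) ** 2     # (4,) against (4, 1) broadcasts to (4, 4)

# Keeping both operands at the same shape
elementwise = (y_true - y_pred) ** 2    # stays (4, 1)

print(pairwise.shape)     # (4, 4)
print(elementwise.shape)  # (4, 1)
```

If this is what happens in the original model, the mean is taken over all pairs rather than over matching rows, which would distort the gradient the network trains on.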
I checked the following code and it works correctly (it builds a model that uses custom_loss to learn and predict the sum of two variables):
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf
import numpy as np

# Toy data: the target is the sum of the two input features
x = np.random.rand(1000,2)
y = x.sum(axis=1)
y = y.reshape(-1,1)

# Keras calls a loss as fn(y_true, y_pred). This loss is symmetric in its
# two arguments, so the argument order does not change the result.
def custom_loss(y_true, y_pred):
    alpha = 2.0
    loss = (y_pred - y_true) ** 2.0
    adj = tf.math.multiply(y_pred, y_true)
    adj = tf.where(tf.greater(adj, 0.0), tf.constant(1/alpha), adj)
    adj = tf.where(tf.less(adj, 0.0), tf.constant(alpha), adj)
    loss = loss * adj
    return tf.reduce_mean(loss)

model = Sequential()
model.add(Dense(128, activation='relu', input_dim=2))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1,))
model.compile(optimizer='adam', loss=custom_loss)
model.fit(x, y, epochs=200, batch_size=16)

for _ in range(10):
    rnd_num = np.random.randint(50, size=2)[None, :]
    pred_add = model.predict(rnd_num)
    print(f'predict sum of {rnd_num[0]} -> {pred_add}')
Output:
Epoch 1/200
63/63 [==============================] - 1s 2ms/step - loss: 0.2903
Epoch 2/200
63/63 [==============================] - 0s 2ms/step - loss: 0.0084
Epoch 3/200
63/63 [==============================] - 0s 2ms/step - loss: 0.0016
...
Epoch 198/200
63/63 [==============================] - 0s 2ms/step - loss: 3.3231e-07
Epoch 199/200
63/63 [==============================] - 0s 2ms/step - loss: 5.1004e-07
Epoch 200/200
63/63 [==============================] - 0s 2ms/step - loss: 9.8688e-08
predict sum of [43 44] -> [[82.81973]]
predict sum of [39 13] -> [[48.97299]]
predict sum of [36 46] -> [[78.05187]]
predict sum of [46 7] -> [[49.445843]]
predict sum of [35 11] -> [[43.311478]]
predict sum of [33 1] -> [[31.695848]]
predict sum of [6 8] -> [[13.433815]]
predict sum of [14 38] -> [[49.54941]]
predict sum of [ 1 40] -> [[39.709686]]
predict sum of [10 2] -> [[11.325197]]
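To sanity-check a port like this by hand, it can help to compare against a plain-Python reference of the same weighting rule. The helper below, adj_mse_reference, is a hypothetical name introduced only for illustration; it mirrors the logic of the code above (weight 1/alpha when prediction and label share a sign, alpha when they differ) and is a sketch, not the paper's implementation.

```python
def adj_mse_reference(outputs, labels, alpha=2.0):
    """Plain-Python reference of the adjusted MSE (hypothetical helper)."""
    total = 0.0
    for o, l in zip(outputs, labels):
        product = o * l
        if product > 0:
            weight = 1.0 / alpha   # same sign: error is scaled down
        elif product < 0:
            weight = alpha         # opposite sign: error is scaled up
        else:
            weight = product       # product is exactly 0, so the weight is 0
        total += weight * (o - l) ** 2
    return total / len(outputs)

# First pair agrees in sign (0.5 * 4), second disagrees (2 * 4): mean is 5.0
print(adj_mse_reference([1.0, -1.0], [3.0, 1.0]))  # 5.0

# A prediction of exactly 0 has product 0, so its weight and loss are 0
print(adj_mse_reference([0.0], [5.0]))  # 0.0
```

The second call shows a property shared by the PyTorch and TensorFlow versions as written: when the product of prediction and label is exactly zero, the weight stays zero and that sample contributes no loss.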