简体   繁体   English

如何指定相关系数为loss keras中的loss function

[英]How to specify the correlation coefficient as the loss function in keras

I am using keras+tensorflow for the first time.我是第一次使用 keras+tensorflow。 I would like to specify the correlation coefficient as the loss function. It makes sense to square it so that it is a number between 0 and 1 where 0 is bad and 1 is good.我想将相关系数指定为损失 function。对它进行平方是有意义的,它是一个介于 0 和 1 之间的数字,其中 0 是坏的,1 是好的。

My basic code currently looks like:我的基本代码目前看起来像:

def baseline_model():
        model = Sequential()
        model.add(Dense(4000, input_dim=n**2, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=32, verbose=2)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10, random_state=0)
results = cross_val_score(pipeline, X, Y, cv=kfold)
print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))

How can I change this so that it optimizes to minimize the squared correlation coefficient instead?我怎样才能改变它,以便它优化以最小化平方相关系数呢?


I tried the following:我尝试了以下内容:

def correlation_coefficient(y_true, y_pred):
    pearson_r, _ = tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)
    return 1-pearson_r**2

def baseline_model():
# create model
        model = Sequential()
        model.add(Dense(4000, input_dim=n**2, kernel_initializer='normal', activation='relu'))
#        model.add(Dense(2000, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss=correlation_coefficient, optimizer='adam')
        return model

but this crashes with:但这崩溃了:

Traceback (most recent call last):
  File "deeplearning-det.py", line 67, in <module>
    results = cross_val_score(pipeline, X, Y, cv=kfold)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/model_selection/_validation.py", line 321, in cross_val_score
    pre_dispatch=pre_dispatch)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/model_selection/_validation.py", line 195, in cross_validate
    for train, test in cv.split(X, y, groups))
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
    while self.dispatch_one_batch(iterator):
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async
    result = ImmediateResult(func)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__
    self.results = batch()
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/model_selection/_validation.py", line 437, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/user/.local/lib/python3.5/site-packages/sklearn/pipeline.py", line 259, in fit
    self._final_estimator.fit(Xt, y, **fit_params)
  File "/home/user/.local/lib/python3.5/site-packages/keras/wrappers/scikit_learn.py", line 147, in fit
    history = self.model.fit(x, y, **fit_args)
  File "/home/user/.local/lib/python3.5/site-packages/keras/models.py", line 867, in fit
    initial_epoch=initial_epoch)
  File "/home/user/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1575, in fit
    self._make_train_function()
  File "/home/user/.local/lib/python3.5/site-packages/keras/engine/training.py", line 960, in _make_train_function
    loss=self.total_loss)
  File "/home/user/.local/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/user/.local/lib/python3.5/site-packages/keras/optimizers.py", line 432, in get_updates
    m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 856, in binary_op_wrapper
    y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 611, in convert_to_tensor
    as_ref=False)
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 676, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 121, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/home/user/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 364, in make_tensor_proto
    raise ValueError("None values not supported.")
ValueError: None values not supported.

Update 1更新 1

Following the answer below the code now runs.按照下面的答案,代码现在运行。 Unfortunately, the correlation_coefficient and correlation_coefficient_loss functions give different values from each other and I am not sure either of them is the same as you would get from 1- scipy.stats.pearsonr ()[0]**2.不幸的是, correlation_coefficientcorrelation_coefficient_loss函数给出了彼此不同的值,我不确定它们中的任何一个是否与您从1-scipy.stats.pearsonr ()[0]**2 中获得的值相同。

Why are loss functions giving the wrong outputs and how can they be corrected to give the same values as 1 - scipy.stats.pearsonr()[0]**2 would give?为什么损失函数会给出错误的输出,如何纠正它们以给出与1 - scipy.stats.pearsonr()[0]**2相同的值?

Here is the completely self contained code that should just run:这是应该运行的完全独立的代码:

import numpy as np
import sys
import math
from scipy.stats import ortho_group
from scipy.stats import pearsonr
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import tensorflow as tf
from keras import backend as K


def permanent(M):
    n = M.shape[0]
    d = np.ones(n)
    j = 0
    s = 1
    f = np.arange(n)
    v = M.sum(axis=0)
    p = np.prod(v)
    while (j < n-1):
        v -= 2*d[j]*M[j]
        d[j] = -d[j]
        s = -s
        prod = np.prod(v)
        p += s*prod
        f[0] = 0
        f[j] = f[j+1]
        f[j+1] = j+1
        j = f[0]
    return p/2**(n-1)


def correlation_coefficient_loss(y_true, y_pred):
    x = y_true
    y = y_pred
    mx = K.mean(x)
    my = K.mean(y)
    xm, ym = x-mx, y-my
    r_num = K.sum(xm * ym)
    r_den = K.sum(K.sum(K.square(xm)) * K.sum(K.square(ym)))
    r = r_num / r_den
    return 1 - r**2


def correlation_coefficient(y_true, y_pred):
    pearson_r, update_op = tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)
    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'correlation_coefficient' in i.name.split('/')[1]]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        pearson_r = tf.identity(pearson_r)
        return 1-pearson_r**2


def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(4000, input_dim=no_rows**2, kernel_initializer='normal', activation='relu'))
#    model.add(Dense(2000, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss=correlation_coefficient_loss, optimizer='adam', metrics=[correlation_coefficient])
    return model


no_rows = 8

print("Making the input data using seed 7", file=sys.stderr)
np.random.seed(7)
U = ortho_group.rvs(no_rows**2)
U = U[:, :no_rows]
# U is a random orthogonal matrix
X = []
Y = []
print(U)
for i in range(40000):
        I = np.random.choice(no_rows**2, size = no_rows)
        A = U[I][np.lexsort(np.rot90(U[I]))]
        X.append(A.ravel())
        Y.append(-math.log(permanent(A)**2, 2))

X = np.array(X)
Y = np.array(Y)

estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=32, verbose=2)))
pipeline = Pipeline(estimators)
X_train, X_test, y_train, y_test = train_test_split(X, Y,
                                                    train_size=0.75, test_size=0.25)
pipeline.fit(X_train, y_train)

Update 2更新 2

I have given up on the correlation_coefficient function and am now just using the correlation_coefficient_loss one as given by JulioDanielReyes below.我已经放弃了correlation_coefficient function,现在只使用下面 JulioDanielReyes 给出的correlation_coefficient_loss However, either this is still wrong or keras is dramatically overfitting.然而,要么这仍然是错误的,要么 keras 严重过度拟合。 Even when I have:即使我有:

def baseline_model():
        model = Sequential()
        model.add(Dense(40, input_dim=no_rows**2, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        model.compile(loss=correlation_coefficient_loss, optimizer='adam', metrics=[correlation_coefficient_loss])
        return model

I get a loss of, for example, 0.6653 after 100 epochs but 0.857 when I test the trained model.例如,我在 100 个纪元后损失了 0.6653,但在测试经过训练的 model 时损失了 0.857。

How can it be overfitting which such a tiny number of nodes in the hidden layer?隐藏层中这么少的节点怎么会过拟合呢?

According to keras documentation , you should pass the squared correlation coefficient as a function instead of the string 'mean_squared_error' .根据keras 文档,您应该将平方相关系数作为函数而不是字符串'mean_squared_error'

The function needs to receive 2 tensors (y_true, y_pred) .该函数需要接收 2 个张量(y_true, y_pred) You can look at keras source code for inspiration.您可以查看 keras 源代码以获得灵感。

There is also a function tf.contrib.metrics.streaming_pearson_correlation implemented on tensorflow.还有一个函数tf.contrib.metrics.streaming_pearson_correlation在 tensorflow 上实现。 Just be careful on the order of the parameters, it should be something like this:请注意参数的顺序,它应该是这样的:

Update 1: initialize local variables according to this issue更新1:根据这个问题初始化局部变量

import tensorflow as tf
def correlation_coefficient(y_true, y_pred):
    pearson_r, update_op = tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true, name='pearson_r'
    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'pearson_r'  in i.name.split('/')]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        pearson_r = tf.identity(pearson_r)
        return 1-pearson_r**2

...

model.compile(loss=correlation_coefficient, optimizer='adam')

Update 2 : even though you cannot use the scipy function directly, you can look at the implementation and port it to your code using keras backend .更新 2 :即使您不能直接使用 scipy 函数,您也可以查看实现并使用keras backend 将其移植到您的代码中。

Update 3: The tensorflow function as it is may not be differentiable, your loss function needs to be something like this: (Please check the math)更新 3:张量流函数可能无法微分,您的损失函数需要是这样的:(请检查数学)

from keras import backend as K
def correlation_coefficient_loss(y_true, y_pred):
    x = y_true
    y = y_pred
    mx = K.mean(x)
    my = K.mean(y)
    xm, ym = x-mx, y-my
    r_num = K.sum(tf.multiply(xm,ym))
    r_den = K.sqrt(tf.multiply(K.sum(K.square(xm)), K.sum(K.square(ym))))
    r = r_num / r_den

    r = K.maximum(K.minimum(r, 1.0), -1.0)
    return 1 - K.square(r)

Update 4: The results are different on both functions, but correlation_coefficient_loss gives the same results as scipy.stats.pearsonr : Here is the code to test it:更新 4:两个函数的结果不同,但correlation_coefficient_loss给出与scipy.stats.pearsonr相同的结果:这是测试它的代码:

import tensorflow as tf
from keras import backend as K
import numpy as np
import scipy.stats

inputa = np.array([[3,1,2,3,4,5],
                    [1,2,3,4,5,6],
                    [1,2,3,4,5,6]])
inputb = np.array([[3,1,2,3,4,5],
                    [3,1,2,3,4,5],
                    [6,5,4,3,2,1]])

with tf.Session() as sess:
    a = tf.placeholder(tf.float32, shape=[None])
    b = tf.placeholder(tf.float32, shape=[None])
    f1 = correlation_coefficient(a, b)
    f2 = correlation_coefficient_loss(a, b)

    sess.run(tf.global_variables_initializer())

    for i in range(inputa.shape[0]):

        f1_result, f2_result = sess.run([f1, f2], feed_dict={a: inputa[i], b: inputb[i]})
        scipy_result =1- scipy.stats.pearsonr(inputa[i], inputb[i])[0]**2
        print("a: "+ str(inputa[i]) + " b: " + str(inputb[i]))
        print("correlation_coefficient: " + str(f1_result))
        print("correlation_coefficient_loss: " + str(f2_result))
        print("scipy.stats.pearsonr:" + str(scipy_result))

Results:结果:

a: [3 1 2 3 4 5] b: [3 1 2 3 4 5]
correlation_coefficient: -2.38419e-07
correlation_coefficient_loss: 0.0
scipy.stats.pearsonr:0.0
a: [1 2 3 4 5 6] b: [3 1 2 3 4 5]
correlation_coefficient: 0.292036
correlation_coefficient_loss: 0.428571
scipy.stats.pearsonr:0.428571428571
a: [1 2 3 4 5 6] b: [6 5 4 3 2 1]
correlation_coefficient: 0.994918
correlation_coefficient_loss: 0.0
scipy.stats.pearsonr:0.0

The following code is an implementation of correlation coefficient in tensorflow version 2.0以下代码是tensorflow 2.0版本中相关系数的实现

import tensorflow as tf

def correlation(x, y):    
    mx = tf.math.reduce_mean(x)
    my = tf.math.reduce_mean(y)
    xm, ym = x-mx, y-my
    r_num = tf.math.reduce_mean(tf.multiply(xm,ym))        
    r_den = tf.math.reduce_std(xm) * tf.math.reduce_std(ym)
    return = r_num / r_den

It returns the same result as numpy's corrcoef function.它返回与 numpy 的corrcoef函数相同的结果。

@Trifon's answer is correct if you have all your data available at the same time.如果您同时拥有所有数据,@Trifon 的回答是正确的。 The below code implements Pearson Correlation as a Keras metric which allows you to get the metric using batch inputs as is typically done during DNN training/eval:下面的代码将 Pearson Correlation 实现为 Keras 指标,它允许您使用批量输入获取指标,这通常是在 DNN 训练/评估期间完成的:

class PearsonCorrelation(tf.keras.metrics.Metric):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.cov = tf.metrics.Sum()
        self.sq_yt = tf.metrics.Sum()
        self.sq_yp = tf.metrics.Sum()
        self.mean_yp = tf.metrics.Mean()
        self.mean_yt = tf.metrics.Mean()
        self.count = tf.metrics.Sum()

    def update_state(self, y_true, y_pred, ):
        ''' Note y_pred are one-hot predictions, not probs/scores '''
        self.cov(y_true * y_pred)
        self.sq_yp(y_pred**2)
        self.sq_yt(y_true**2)
        self.mean_yp(y_pred)
        self.mean_yt(y_true)
        self.count(tf.reduce_sum(tf.shape(y_true)))

    def result(self):
        count = self.count.result()
        mean_yp = self.mean_yp.result()
        mean_yt = self.mean_yt.result()
        numerator = (self.cov.result() - count * self.mean_yp.result() * self.mean_yt.result())
        denominator = tf.sqrt(self.sq_yp.result() - count * mean_yp**2) * \
                      tf.sqrt(self.sq_yt.result() - count * mean_yt**2)
        return numerator / denominator

    def reset_states(self):
        self.cov.reset_states()
        self.sq_yt.reset_states()
        self.sq_yp.reset_states()
        self.mean_yp.reset_states()
        self.mean_yt.reset_states()
        self.count.reset_states()
r = scipy.stats.pearsonr(inputa[i], inputb[i])[0] 

r is the correlation, so why did you take the square over r ? r是相关性,那么你为什么要对r取平方?

scipy_result = 1 - scipy.stats.pearsonr(inputa[i], inputb[i])[0]**2

what is the relation between r and scipy_result ? rscipy_result之间的关系是什么?

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM