Find the correlation of a non-linear function for numpy.corr()

I wrote a program that reads a CSV file and computes the correlation between its two columns. The problem is that the standard way of computing correlation does not work on curves and other non-linear functions. Is there another function, or an easy way to transform the data, to determine the correlation? Below are my code so far, the CSV input, and the current output.

import pandas as pd

def findCorrelation(csvFileName):
    # Load the two-column CSV and convert every value to float
    df = pd.read_csv(csvFileName).astype(float)
    # Pearson correlation matrix; entry [0][1] is column 0 against column 1
    corr = df.corr().values
    return corr[0][1]

def correlationMeaning(corr):
    if corr == 1:
        return ['perfect', 'positive', str(corr)]
    elif corr > 0.9:
        return ['high', 'positive', str(corr)]
    elif corr > 0.5:
        return ['medium', 'positive', str(corr)]
    elif corr > 0.1:
        return ['low', 'positive', str(corr)]
    elif corr > -0.1:
        return ['no', str(corr)]
    elif corr > -0.5:
        return ['low', 'negative', str(corr)]
    elif corr > -0.9:
        return ['medium', 'negative', str(corr)]
    elif corr > -1:
        return ['high', 'negative', str(corr)]
    elif corr == -1:
        return ['perfect', 'negative', str(corr)]
    else:
        return ['error']

print(correlationMeaning(findCorrelation('CurveData.csv')))

CSV input:

Temp,Sales
30,50
34,52
38,54
42,56
46,58
50,60
54,62
58,62
62,60
66,58
70,56
74,54
78,52
82,50

Output:

['no', '0.0']

GENERAL ANSWER

The correlation coefficient measures, by definition, how well the data fit a straight line. What I think you want is some form of curve fitting. The catch is that you have to experiment, either by hand or programmatically, to find a good fit.

Also, curve-fitting methods do not provide a direct analogue of the correlation coefficient, although the least-squares error can easily be put to that purpose.
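One common way to turn a least-squares error into a correlation-like score is the coefficient of determination, R^2. The sketch below is only an illustration, not the answerer's own method: it assumes a polynomial is a reasonable candidate curve, hard-codes the question's data, and uses a hypothetical helper named r_squared.

```python
import numpy as np

# Data copied from the CSV in the question
temp = np.array([30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82], dtype=float)
sales = np.array([50, 52, 54, 56, 58, 60, 62, 62, 60, 58, 56, 54, 52, 50], dtype=float)

def r_squared(x, y, degree):
    """Fit a polynomial of the given degree by least squares and return R^2."""
    coeffs = np.polyfit(x, y, degree)
    predicted = np.polyval(coeffs, x)
    ss_res = np.sum((y - predicted) ** 2)   # squared error left after the fit
    ss_tot = np.sum((y - y.mean()) ** 2)    # squared error of a flat line at the mean
    return 1.0 - ss_res / ss_tot

r2_line = r_squared(temp, sales, 1)      # straight line: explains essentially nothing
r2_parabola = r_squared(temp, sales, 2)  # parabola: explains most of the variation
print(r2_line, r2_parabola)
```

A score near 1 means the candidate curve explains most of the variation in Sales; a score near 0 means it does no better than the mean. The parabola scores high here but not perfectly, since the data form a vee rather than a smooth curve.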

SPECIFIC APPLICATION

The given case is a simple "vee" shape, inverted, with its peak at Temp = 56; you need a non-linear transformation of your independent variable (Temp) to get a nice fit: X <= abs(X-56), that is, replace each X with its distance from 56. Now you have a perfect correlation (a coefficient of -1, because Sales falls steadily as that distance grows).
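As a quick check of that transformation, the following sketch computes the coefficient before and after folding Temp around 56. It uses np.corrcoef on hard-coded copies of the question's data rather than the asker's CSV-reading function, purely to keep the example self-contained.

```python
import numpy as np

# Data copied from the CSV in the question
temp = np.array([30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82], dtype=float)
sales = np.array([50, 52, 54, 56, 58, 60, 62, 62, 60, 58, 56, 54, 52, 50], dtype=float)

# Pearson coefficient on the raw data: the rising and falling halves cancel out
raw_corr = np.corrcoef(temp, sales)[0, 1]

# Fold the independent variable around the peak, then correlate again
folded = np.abs(temp - 56)
folded_corr = np.corrcoef(folded, sales)[0, 1]

print(raw_corr, folded_corr)
```

The raw coefficient comes out at essentially zero, while the folded one is -1 up to floating-point rounding. If correlationMeaning is applied to the folded value, rounding may land it in the 'high' branch rather than 'perfect', because that function tests for exact equality with -1.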

If you want a program that experiments with various fits and derives the best one for each arbitrary data set, I'm afraid you'll have to write the outer shell yourself. However, there are a number of packages (such as SciKit) that provide functions to optimize a set of equations against a given error function. If you want to tackle the larger project, you might want to research those facilities.
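A very small version of such an outer shell might look like the sketch below. It assumes the only family of transformations being tried is abs(X - pivot) and searches a fixed grid of pivots with plain NumPy; the helper best_vee_pivot and the candidate grid are hypothetical choices made for illustration, and a real tool would try more families or hand the search to an optimizer.

```python
import numpy as np

def best_vee_pivot(x, y, candidates):
    """Return (pivot, corr) for the pivot whose abs(x - pivot) correlates most strongly with y."""
    best_pivot, best_corr = None, 0.0
    for pivot in candidates:
        folded = np.abs(x - pivot)
        corr = np.corrcoef(folded, y)[0, 1]
        # Keep the candidate with the largest magnitude, positive or negative
        if abs(corr) > abs(best_corr):
            best_pivot, best_corr = pivot, corr
    return best_pivot, best_corr

# Data copied from the CSV in the question
temp = np.array([30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82], dtype=float)
sales = np.array([50, 52, 54, 56, 58, 60, 62, 62, 60, 58, 56, 54, 52, 50], dtype=float)

# Try every whole-degree pivot between the lowest and highest temperature
pivot, corr = best_vee_pivot(temp, sales, np.arange(30.0, 83.0, 1.0))
print(pivot, corr)
```

On this data the search settles on a pivot of 56, matching the transformation found by eye.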

In the meantime, perhaps a simple plotting function would help you narrow the field for your specific needs?

Try computing the correlation element-wise: go over all the elements of the curves and find the correlation value for each pair. Then average those values into a single number, which will indicate whether you have high, medium, low, or no correlation.
