Pandas.DataFrame.apply returns None values
An example of the input DataFrame is as follows:
y1 y2 y3 y4 y5 y6
2.3 2.8 2.9 2.8 2.3 2.2
2.9 3 3.1 2.9 2.8 3
1.7 2.2 2.1 2.1 1.7 1.8
2 2.2 2.1 2.1 1.9 2.1
I want to compute a linear regression for each row, so I ran this code:
import numpy as np
import pandas as pd
import scipy.stats as st
df=pd.read_excel(r'test.xlsx')
def lrg(y,p):
    x=np.arange(1,7)
    k,_,r,p,_=st.linregress(x,y) # returns a 5-element tuple; I use 3 of them
    if p=='k':
        return k
    if p=='r':
        return r
    if p=='p':
        return p
col=['y'+str(i) for i in range(1,7)]
df['r']=df[col].apply(lambda y:lrg(y,'r'),axis=1) # add values r as new column
Why are the values in the 'r' column of the returned df all None?
df:
y1 y2 y3 y4 y5 y6 r
0 2.3 2.8 2.9 2.8 2.3 2.2 None
1 2.9 3.0 3.1 2.9 2.8 3.0 None
2 1.7 2.2 2.1 2.1 1.7 1.8 None
3 2.0 2.2 2.1 2.1 1.9 2.1 None
The problem is with the argument p: you are reassigning it inside the function, so the selector string you passed in is lost. Rename the argument p to something else.
def lrg(y,j):
    x=np.arange(1,7)
    k,_,r,p,_=st.linregress(x,y)
    if j=='k':
        return k
    if j=='r':
        return r
    if j=='p':
        return p
df['r'] = df[col].apply(lambda y: lrg(y,'r'),axis=1)
y1 y2 y3 y4 y5 y6 r
0 2.3 2.8 2.9 2.8 2.3 2.2 -0.356753
1 2.9 3.0 3.1 2.9 2.8 3.0 -0.152894
2 1.7 2.2 2.1 2.1 1.7 1.8 -0.237468
3 2.0 2.2 2.1 2.1 1.9 2.1 -0.207020
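To see the effect in isolation, here is a minimal sketch that needs neither pandas nor SciPy. The helper fake_linregress and the numbers it returns are hypothetical stand-ins for st.linregress, and the names lrg_buggy and lrg_fixed are made up for this illustration. Once p is rebound to a number, none of the string comparisons match, so the function reaches its end and Python returns None implicitly.

```python
def fake_linregress():
    # Hypothetical stand-in for st.linregress: five made-up numbers
    # in the order (slope, intercept, r, p-value, stderr).
    return 0.5, 1.0, -0.36, 0.49, 0.1

def lrg_buggy(p):
    # The unpacking rebinds p: it is now 0.49, no longer the selector string.
    k, _, r, p, _ = fake_linregress()
    if p == 'k':
        return k
    if p == 'r':
        return r
    if p == 'p':
        return p
    # No branch matched, so the function falls through and returns None.

def lrg_fixed(j):
    # The selector j is a different name, so the unpacking leaves it untouched.
    k, _, r, p, _ = fake_linregress()
    if j == 'k':
        return k
    if j == 'r':
        return r
    if j == 'p':
        return p

print(lrg_buggy('r'))  # None
print(lrg_fixed('r'))  # -0.36
```

This is why apply fills the whole column with None: every row goes through the same fall-through path.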
You're overwriting the value of p inside the function.
def lrg(y, p): # <---- here
    x=np.arange(1,7)
    k,_,r,p,_=st.linregress(x,y) # <---- p redefined
    ...
Change the name, and you should be good.
You can also use a dictionary lookup to consolidate your code a little.
x = np.arange(1, 7)
def lrg(y, p):
    k, _, r, p2, _ = st.linregress(x, y)
    vals = {'k' : k, 'r' : r, 'p' : p2}
    return vals.get(p, np.nan)
col = ['y' + str(i) for i in range(1,7)]
df['r'] = df[col].apply(lambda y: lrg(y, 'r'), axis=1)
df
y1 y2 y3 y4 y5 y6 r
0 2.3 2.8 2.9 2.8 2.3 2.2 -0.356753
1 2.9 3.0 3.1 2.9 2.8 3.0 -0.152894
2 1.7 2.2 2.1 2.1 1.7 1.8 -0.237468
3 2.0 2.2 2.1 2.1 1.9 2.1 -0.207020
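Another way to sidestep the name clash, offered as a sketch rather than the only fix: the object returned by st.linregress also exposes its values as named attributes (slope, rvalue, pvalue), so nothing has to be unpacked into local names at all. The selector name what is an assumption made for this example, and the DataFrame is built inline from the question's sample rows instead of being read from test.xlsx.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

# Sample rows copied from the question (built inline instead of reading test.xlsx).
col = ['y' + str(i) for i in range(1, 7)]
df = pd.DataFrame(
    [[2.3, 2.8, 2.9, 2.8, 2.3, 2.2],
     [2.9, 3.0, 3.1, 2.9, 2.8, 3.0],
     [1.7, 2.2, 2.1, 2.1, 1.7, 1.8],
     [2.0, 2.2, 2.1, 2.1, 1.9, 2.1]],
    columns=col,
)
x = np.arange(1, 7)

def lrg(y, what):
    # 'what' is a hypothetical selector name: 'k' (slope), 'r' or 'p'.
    res = st.linregress(x, y)
    # Read the named attributes instead of unpacking into local variables.
    fields = {'k': res.slope, 'r': res.rvalue, 'p': res.pvalue}
    # Unknown selectors give NaN rather than a silent None.
    return fields.get(what, np.nan)

df['r'] = df[col].apply(lambda y: lrg(y, 'r'), axis=1)
print(df)
```

With the attribute names spelled out, there is no local variable called p that could shadow the selector.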
You overwrite the variable p here:
k,_,r,p,_=st.linregress(x,y) # returns a 5-element tuple; I use 3 of them
After this line, p no longer holds the value that was passed to the function.