Python 3 lambda error: The truth value of a Series is ambiguous
I am getting this error in my lambda function: The truth value of a Series is ambiguous. I know there is a very comprehensive explanation of this error elsewhere, but I don't think it relates to my issue: Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Basically, I am trying to determine via a lambda whether OpenBal is the same from one month to the next within the same AccountID, and to get a '1' if it is the same (e.g. for OpenBal=101 below). Obviously the first record should give me a NaN. (PS: thanks @jdehesa for your answers in my other post.)
This demonstrates my problem:
import pandas as pd
df = pd.DataFrame({'AccountID': [1,1,1,1,2,2,2,2,2],
                   'RefMonth': [1,2,3,4,1,2,3,4,5],
                   'OpenBal': [100,101,101,103,200,201,202,203,204]})
SameBal = df.groupby('AccountID').apply(lambda g: 1 if g['OpenBal'].diff() == 0 else 0)
df['SameBal'] = SameBal.sortlevel(1).values
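The expression 1 if g['OpenBal'].diff() == 0 else 0 does not work. g['OpenBal'].diff() == 0 is a boolean pd.Series with one value per row, and a Series with more than one element cannot be collapsed into a single True or False for an if condition. That is not how a pd.Series object operates.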
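To make the failure concrete, here is a minimal sketch that reproduces the error outside of groupby. The names s, mask, error_message and flags are illustrative only and do not come from the original post.

```python
import pandas as pd

# One account's balances, taken from the sample data in the question
s = pd.Series([100, 101, 101, 103])

# Elementwise comparison: the result is a boolean Series, not a single bool
mask = s.diff() == 0

error_message = None
try:
    # A conditional expression calls bool(mask), which pandas refuses to do
    flag = 1 if mask else 0
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Elementwise alternative: convert the whole boolean Series to 0/1 at once
flags = mask.astype(int)
print(flags.tolist())
```

The comparison itself is fine. It is only the attempt to use the whole Series as one condition that raises.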
You need to create a function that handles each element separately:
import numpy as np
def convert(a):
    # 1 where the difference is 0, NaN where it is missing, 0 otherwise
    return np.array([1 if i == 0 else np.nan if pd.isnull(i) else 0 for i in a])
This avoids the "The truth value of a Series is ambiguous" error, because each comparison is now made on a single value:
SameBal = df.groupby('AccountID').apply(lambda g: pd.Series(data=convert(g['OpenBal'].diff().values), index=g['RefMonth']))
SameBal.name = 'SameBal'
SameBal
Out[]:
AccountID  RefMonth
1          1           NaN
           2           0.0
           3           1.0
           4           0.0
2          1           NaN
           2           0.0
           3           0.0
           4           0.0
           5           0.0
df.merge(SameBal.reset_index())
Out[]:
   AccountID  OpenBal  RefMonth  SameBal
0          1      100         1      NaN
1          1      101         2      0.0
2          1      101         3      1.0
3          1      103         4      0.0
4          2      200         1      NaN
5          2      201         2      0.0
6          2      202         3      0.0
7          2      203         4      0.0
8          2      204         5      0.0
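The merge above relies on the pandas default of joining on every column name the two frames share, here AccountID and RefMonth. The following is a minimal sketch of the same step with the keys spelled out. It assumes a stand-in Series named same_bal (a hypothetical name), rebuilt here with a vectorised diff only so that the example runs on its own. The how and validate arguments are an optional safeguard, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'AccountID': [1, 1, 1, 1, 2, 2, 2, 2, 2],
                   'RefMonth': [1, 2, 3, 4, 1, 2, 3, 4, 5],
                   'OpenBal': [100, 101, 101, 103, 200, 201, 202, 203, 204]})

# Stand-in for SameBal: a flag indexed by (AccountID, RefMonth)
diff = (df.set_index(['AccountID', 'RefMonth'])['OpenBal']
          .groupby(level='AccountID')
          .diff())
same_bal = diff.eq(0).astype(float).where(diff.notna()).rename('SameBal')

# Explicit join keys; validate raises if a key pair is duplicated on either side
merged = df.merge(same_bal.reset_index(),
                  on=['AccountID', 'RefMonth'],
                  how='left',
                  validate='one_to_one')
print(merged)
```

Naming the keys keeps the join from changing silently if the two frames later gain another column with the same name.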
Your error correctly indicates that you can't check the truth value of a Series. But custom anonymous functions are not necessary for this task.
Using groupby + transform with pd.Series.diff:
import pandas as pd
df = pd.DataFrame({'AccountID': [1,1,1,1,2,2,2,2,2],
                   'RefMonth': [1,2,3,4,1,2,3,4,5],
                   'OpenBal': [100,101,101,103,200,201,202,203,204]})
df['A'] = (df.groupby('AccountID')['OpenBal'].transform(pd.Series.diff)==0).astype(int)
print(df)
   AccountID  OpenBal  RefMonth  A
0          1      100         1  0
1          1      101         2  0
2          1      101         3  1
3          1      103         4  0
4          2      200         1  0
5          2      201         2  0
6          2      202         3  0
7          2      203         4  0
8          2      204         5  0
If you need NaN for the first row of each group:
import numpy as np
g = df.groupby('AccountID')['OpenBal'].transform(pd.Series.diff)
df['A'] = (g == 0).astype(float)  # float, so the column can hold NaN
df.loc[g.isnull(), 'A'] = np.nan
print(df)
   AccountID  OpenBal  RefMonth    A
0          1      100         1  NaN
1          1      101         2  0.0
2          1      101         3  1.0
3          1      103         4  0.0
4          2      200         1  NaN
5          2      201         2  0.0
6          2      202         3  0.0
7          2      203         4  0.0
8          2      204         5  0.0
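If the flag should stay an integer rather than become 0.0/1.0, one possible variation (not from the original answers) is to call diff directly on the grouped column and store the result in pandas' nullable Int64 dtype. This is a sketch that assumes a pandas version with nullable integer support. The names diff and flag are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'AccountID': [1, 1, 1, 1, 2, 2, 2, 2, 2],
                   'RefMonth': [1, 2, 3, 4, 1, 2, 3, 4, 5],
                   'OpenBal': [100, 101, 101, 103, 200, 201, 202, 203, 204]})

# Month-over-month change within each account; first row of each group is NaN
diff = df.groupby('AccountID')['OpenBal'].diff()

# 1 where the balance is unchanged, 0 where it changed, <NA> where no prior month exists
flag = diff.eq(0).astype('Int64').mask(diff.isna())
df['A'] = flag
print(df)
```

The missing first-month values then show up as <NA> instead of NaN, and the remaining entries print as 0 and 1.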