Applying a function using 2 dataframe columns as arguments

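I want to apply a function that creates a new column based on the values in two other columns.

  1. The first column, 'SSPstaterank', holds the suburb's rank.

  2. The second column, 'SSPstaterank%', holds the percentile of the suburb's rank.

I thought this code would work, but it returns:

TypeError: ("'DataFrame' object is not callable", 'occurred at index 0')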

def func1 (a,b):
    if a == 1:
        return 'the #1 suburb'
    elif b >= 0.95:
        return 'ranked top 5% of suburbs'
    elif b >= 0.9:
        return 'ranked top 10% of suburbs'
    else:
        return 'none'

df2['rankdescript'] = df2.apply(lambda x: df2(x['SSPstaterank'], x['SSPstaterank%']), axis=1)

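Use func1 instead of df2 inside the lambda. df2 is the DataFrame itself, not a function, so calling df2(...) raises the TypeError: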

df2['rankdescript'] = df2.apply(lambda x: func1(x['SSPstaterank'],x['SSPstaterank%']), axis=1)
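To make the fix easy to try, here is a minimal self-contained sketch. The sample values are the ones used in the numpy.select example in this answer; the zip-based variant and the names by_apply and by_zip are illustrative additions, not part of the original answer.

```python
import pandas as pd

def func1(a, b):
    # a: suburb rank, b: percentile of that rank
    if a == 1:
        return 'the #1 suburb'
    elif b >= 0.95:
        return 'ranked top 5% of suburbs'
    elif b >= 0.9:
        return 'ranked top 10% of suburbs'
    else:
        return 'none'

df2 = pd.DataFrame({'SSPstaterank': [2, 1, 2, 2, 7],
                    'SSPstaterank%': [.99, .93, .93, .98, .23]})

# Row-wise apply: the lambda receives one row at a time and calls func1
by_apply = df2.apply(lambda x: func1(x['SSPstaterank'], x['SSPstaterank%']), axis=1)

# Alternative (assumption: not in the original answer): iterate over the two
# columns directly, which avoids building a Series for every row
by_zip = [func1(a, b) for a, b in zip(df2['SSPstaterank'], df2['SSPstaterank%'])]

df2['rankdescript'] = by_apply
print(df2)
```

Both variants call func1 once per row, so they should give the same labels; the difference is only in how the two arguments are pulled out of the frame.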

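Another solution uses numpy.select, which should be faster because it works on whole columns instead of calling a Python function for every row: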

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'SSPstaterank':[2,1,2,2,7],
                    'SSPstaterank%':[.99,.93,.93,.98,.23]})


m1 = df2['SSPstaterank'] == 1
m2 = df2['SSPstaterank%'] >= 0.95
m3 = df2['SSPstaterank%'] >= 0.9

masks = [m1, m2, m3]
vals = ['the #1 suburb','ranked top 5% of suburbs','ranked top 10% of suburbs']

df2['rankdescript'] = np.select(masks, vals, default='not matched')
print (df2)
   SSPstaterank  SSPstaterank%               rankdescript
0             2           0.99   ranked top 5% of suburbs
1             1           0.93              the #1 suburb
2             2           0.93  ranked top 10% of suburbs
3             2           0.98   ranked top 5% of suburbs
4             7           0.23                not matched
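numpy.select takes the first condition that is true for each row, so the order of the masks plays the same role as the if/elif order in func1. The sketch below checks that on a small made-up frame; the frame demo, its values, and the names vectorized and row_wise are hypothetical, and default='none' is used here only so that the labels match func1 exactly.

```python
import numpy as np
import pandas as pd

def func1(a, b):
    if a == 1:
        return 'the #1 suburb'
    elif b >= 0.95:
        return 'ranked top 5% of suburbs'
    elif b >= 0.9:
        return 'ranked top 10% of suburbs'
    else:
        return 'none'

# Hypothetical data: the first row satisfies all three conditions at once
demo = pd.DataFrame({'SSPstaterank': [1, 3, 5, 40],
                     'SSPstaterank%': [1.0, 0.97, 0.91, 0.5]})

conditions = [demo['SSPstaterank'] == 1,
              demo['SSPstaterank%'] >= 0.95,
              demo['SSPstaterank%'] >= 0.9]
choices = ['the #1 suburb', 'ranked top 5% of suburbs', 'ranked top 10% of suburbs']

# The first true condition wins, like the if/elif chain in func1
vectorized = np.select(conditions, choices, default='none').tolist()

# Row-by-row reference result for comparison
row_wise = [func1(a, b) for a, b in zip(demo['SSPstaterank'], demo['SSPstaterank%'])]

print(vectorized == row_wise)
```

If the masks were listed in a different order, the first row would pick up a percentile label instead of 'the #1 suburb', so keep the most specific condition first.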
