Using more than one argument to 'apply' in Pandas dataframe
Here is the code:
import pandas as pd
import numpy as np
a = pd.Series(['1a', '2a'])
b = pd.Series(['1b', '2b'])
c = pd.Series([2, 3])
df = pd.concat((a.rename('a'), b.rename('b'), c.rename('c')), axis=1)
def ar(a, b, c):
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr
How can the apply method be used to generate:
 a   b   0  1    2
---------------------
1a  1b   0  0  NaN
1a  1b   0  1  NaN
2a  2b   0  0    0
2a  2b   0  1    0
2a  2b   0  0    1
...something like df.c.apply(ar, df.a, df.b) does not work. Thanks
One quick and simple way is to "vectorize" your function using np.vectorize, allowing numpy to "hide" the loop (a lot like apply, but with less overhead; it is still a Python-level loop under the hood).
v = np.vectorize(ar)
pd.concat(v(df.a, df.b, df.c))
   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0
vectorize takes as input a function that operates on scalars, and returns a wrapper to which you can pass vectors; the function is then applied to them element-wise.
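As a hedged variant of the call above: when otypes is not given, np.vectorize calls the function once on the first set of inputs to infer the output type. Passing otypes=[object] (an addition for this sketch, not part of the original answer) states explicitly that each call returns an arbitrary Python object, here a DataFrame. The setup from the question is repeated, using an equivalent pd.DataFrame constructor, so the block runs on its own:

```python
import numpy as np
import pandas as pd

# Same data as in the question, built directly as a DataFrame
df = pd.DataFrame({'a': ['1a', '2a'], 'b': ['1b', '2b'], 'c': [2, 3]})

def ar(a, b, c):
    # c x c diagonal matrix with 0..c-1 on the diagonal, plus two label columns
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr

# otypes=[object] is an assumption added here: it skips the type-inference
# call and makes the result an object array of DataFrames
v = np.vectorize(ar, otypes=[object])
frames = v(df.a, df.b, df.c)      # one DataFrame per input row
result = pd.concat(list(frames))  # stack them; missing columns become NaN
print(result)
```

The frames have different numbers of columns (two versus three numeric columns), so pd.concat fills the missing column of the smaller frame with NaN.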
Calling the vectorized function is similar to looping over a zipped version of your input and calling ar at each iteration:
r = []
for x, y, z in zip(df.a, df.b, df.c):
    r.append(ar(x, y, z))
pd.concat(r)
   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0
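The loop can also be written as a list comprehension. The sketch below additionally moves the a and b columns to the front and resets the index, which brings the layout closer to the table in the question. The ignore_index=True argument and the column reordering are additions for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Same data as in the question, built directly as a DataFrame
df = pd.DataFrame({'a': ['1a', '2a'], 'b': ['1b', '2b'], 'c': [2, 3]})

def ar(a, b, c):
    # c x c diagonal matrix with 0..c-1 on the diagonal, plus two label columns
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr

# One DataFrame per row of df, built by iterating the three columns in lockstep
frames = [ar(x, y, z) for x, y, z in zip(df.a, df.b, df.c)]

# ignore_index=True replaces the repeated 0,1 / 0,1,2 index with 0..4
out = pd.concat(frames, ignore_index=True)

# Put the label columns first, keeping the numeric columns in their order
label_cols = ['a', 'b']
out = out[label_cols + [col for col in out.columns if col not in label_cols]]
print(out)
```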
One can use the map function here, which is very similar to apply for dataframes:
outlist = list(map(lambda x, y, z: ar(x, y, z), a, b, c))
outdf = pd.concat(list(map(pd.DataFrame, outlist)))
# or: outdf = pd.concat([pd.DataFrame(outlist[0]), pd.DataFrame(outlist[1])])
print(outdf)
Output:
   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0
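Since ar already takes three scalars and returns a DataFrame, the lambda and the extra pd.DataFrame wrapping are optional. A minimal sketch of the shorter form follows; the keys argument to pd.concat is an addition for illustration (not in the original answer) that records which input row each block came from:

```python
import numpy as np
import pandas as pd

# Same data as in the question, built directly as a DataFrame
df = pd.DataFrame({'a': ['1a', '2a'], 'b': ['1b', '2b'], 'c': [2, 3]})

def ar(a, b, c):
    # c x c diagonal matrix with 0..c-1 on the diagonal, plus two label columns
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr

# map with several iterables passes one element from each to ar per call
frames = list(map(ar, df.a, df.b, df.c))

# keys adds an outer index level holding the originating row label of df
outdf = pd.concat(frames, keys=list(df.index))
print(outdf)

# Rows produced from the second input row (index label 1)
print(outdf.loc[1])
```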