Using "apply" with multiple arguments on a Pandas DataFrame

Question

Here is the code:

import pandas as pd
import numpy as np

a = pd.Series(['1a', '2a'])
b = pd.Series(['1b', '2b'])
c = pd.Series([2, 3])

df = pd.concat((a.rename('a'), b.rename('b'), c.rename('c')), axis=1)

def ar(a, b, c):
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b

    return arr

How can the apply method be used to generate:

 a   b   0  1  2
---------------------
1a  1b   0  0  NaN
1a  1b   0  1  NaN
2a  2b   0  0  0
2a  2b   0  1  0
2a  2b   0  0  1

...something like df.c.apply(ar, df.a, df.b) does not work. Thanks

Answer 1

One quick and simple way is to "vectorize" your function using np.vectorize, which lets NumPy "hide" the loop (a lot like apply, but with less overhead).

v = np.vectorize(ar)
pd.concat(v(df.a, df.b, df.c))

   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0

vectorize takes as input a function that operates on scalars and returns a version that accepts vectors, applying the original function element-wise.
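To see the element-wise behavior in isolation, here is a minimal sketch using a hypothetical scalar function, clipped_diff, which is not part of the question. One caveat worth keeping in mind: the NumPy documentation describes vectorize as provided primarily for convenience, not performance, and as essentially a for loop internally, so any speed difference from an explicit loop is likely to be small.

```python
import numpy as np

# Hypothetical scalar function for illustration only: it understands
# single numbers, not arrays (the `if` would fail on an array).
def clipped_diff(x, y):
    return x - y if x > y else 0

v_clipped_diff = np.vectorize(clipped_diff)

# Arguments are paired position by position: (5, 3), (2, 4), (9, 1).
pairwise = v_clipped_diff([5, 2, 9], [3, 4, 1])

# A scalar argument is broadcast against the vector argument.
against_scalar = v_clipped_diff([5, 2, 9], 3)

print(pairwise.tolist())        # [2, 0, 8]
print(against_scalar.tolist())  # [2, 0, 6]
```

Calling v(df.a, df.b, df.c) in the answer works the same way: ar receives one element of each column per call.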

This is similar to looping over a zipped version of your inputs and calling ar at each iteration:

r = []
for x, y, z in zip(df.a, df.b, df.c):
    r.append(ar(x, y, z))

pd.concat(r)

   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0
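The same loop can also be written as a list comprehension over the rows of df. The sketch below rebuilds the question's data and ar so that it runs on its own, and uses DataFrame.itertuples(index=False), which yields one (a, b, c) tuple per row. Unpacking with * assumes the column order matches the parameter order of ar, which holds for this df but is not guaranteed in general.

```python
import numpy as np
import pandas as pd

# Same data as the question's df, built directly for brevity.
df = pd.DataFrame({'a': ['1a', '2a'], 'b': ['1b', '2b'], 'c': [2, 3]})

def ar(a, b, c):
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr

# One small DataFrame per row of df; assumes columns are ordered a, b, c.
frames = [ar(*row) for row in df.itertuples(index=False)]
result = pd.concat(frames)

print(result)
```

If the column order might differ, calling ar(row.a, row.b, row.c) by attribute name is the safer spelling.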

Answer 2

One can use the built-in map function here, which works much like apply does for DataFrames:

# map calls ar once per position, taking one element from each of a, b, c
outlist = list(map(ar, a, b, c))
# ar already returns DataFrames, so the pieces can be concatenated directly
outdf = pd.concat(outlist)
# or: outdf = pd.concat([outlist[0], outlist[1]])
print(outdf)

Output:

   0  1   a   b    2
0  0  0  1a  1b  NaN
1  0  1  1a  1b  NaN
0  0  0  2a  2b  0.0
1  0  1  2a  2b  0.0
2  0  0  2a  2b  2.0
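In the outputs above the row index restarts for each piece (0, 1, 0, 1, 2), so row labels are not unique. If that matters, pd.concat accepts keys= to record which input row each block came from, or ignore_index=True to renumber the rows. A minimal sketch, repeating the question's setup so that it runs standalone; the names labelled and renumbered are illustrative:

```python
import numpy as np
import pandas as pd

a = pd.Series(['1a', '2a'])
b = pd.Series(['1b', '2b'])
c = pd.Series([2, 3])

def ar(a, b, c):
    arr = pd.DataFrame(np.diag(np.arange(c)))
    arr['a'] = a
    arr['b'] = b
    return arr

outlist = list(map(ar, a, b, c))

# keys= adds an outer index level identifying the source row of each block.
labelled = pd.concat(outlist, keys=a.index.tolist())

# ignore_index=True discards the per-block index and renumbers from 0.
renumbered = pd.concat(outlist, ignore_index=True)

print(labelled.index.tolist())    # [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]
```

With the keyed version, labelled.loc[1] selects only the rows generated from the second input row.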

2 answers

Answer 1
1 accepted 2018-01-10 04:04:40

Answer 2
1 2018-01-10 04:35:39
