Pandas: create a frame whose entries are the values of a function applied to the respective entries of other DataFrames

I have three frames, let's say:

ID    A  B   | ID    A B  | ID    A B
john  *  1   | john  # 2  | john  @ 3
paul  1  1   | paul  2 2  | paul  3 3
jones 1  1   | jones 2 2  | jones 3 3

and I have to create a new dataframe where each entry is the result of a function whose arguments are the respective entries of the three frames:

ID    A        B
john f(*,#,@) f(1,2,3)
...

I'm new to pandas, and the only approach I know is to turn the frames into numpy arrays and work on them as three matrices, but I would prefer to solve this problem the pandas way.

I already tried looking for other questions on SO but couldn't find anything; that may be due to how I've phrased my question.

I'm not sure exactly what you're trying to do, but here is one approach:

# Define a dummy function (f) that receives one row per ID
def f(x):
    # Inside f you can use x.name, x.A and x.B, for example:
    # >>> x.name
    # 'paul'
    # >>> x.A
    # ['1', '2', '3']
    # >>> x.B
    # [1, 2, 3]
    return x

>>> df1
      ID  A  B
0   john  *  1
1   paul  1  1
2  jones  1  1

>>> df2
      ID  A  B
0   john  #  2
1   paul  2  2
2  jones  2  2

>>> df3
      ID  A  B
0   john  @  3
1   paul  3  3
2  jones  3  3

>>> pd.concat([df1,df2,df3]).groupby('ID').agg(list).apply(f, axis=1)
               A          B
ID
john   [*, #, @]  [1, 2, 3]
jones  [1, 2, 3]  [1, 2, 3]
paul   [1, 2, 3]  [1, 2, 3]
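
As a runnable sketch of the approach above, here is the same pipeline with a concrete f. The choice of f (joining the A values into one string and summing the B values) is only an assumption for illustration; the question does not say what the real function does.

```python
import pandas as pd

ids = ['john', 'paul', 'jones']
df1 = pd.DataFrame({'ID': ids, 'A': ['*', '1', '1'], 'B': [1, 1, 1]})
df2 = pd.DataFrame({'ID': ids, 'A': ['#', '2', '2'], 'B': [2, 2, 2]})
df3 = pd.DataFrame({'ID': ids, 'A': ['@', '3', '3'], 'B': [3, 3, 3]})

def f(x):
    # x is one row of the grouped frame: x['A'] and x['B'] are lists
    # holding one value per input frame, in concat order.
    # Hypothetical example function: join the A values, sum the B values.
    return pd.Series({'A': ''.join(x['A']), 'B': sum(x['B'])})

# Stack the frames, collect each ID's values into lists, then apply f per row.
grouped = pd.concat([df1, df2, df3]).groupby('ID').agg(list)
result = grouped.apply(f, axis=1)
print(result)
```

Note that groupby sorts the IDs, so the rows come back in alphabetical order rather than in the order of the input frames.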

If you have:

import pandas as pd

df0=pd.DataFrame.from_dict({'ID':['john','paul'],'A':1,'B':2})
df1=pd.DataFrame.from_dict({'ID':['john','paul'],'A':3,'B':4})
df2=pd.DataFrame.from_dict({'ID':['john','paul'],'A':5,'B':6})

Merge these three dataframes on ID and rename the columns so each one carries the number of its source frame:

merged=df0.merge(df1, on='ID').merge(df2, on='ID')
merged.columns=['ID','A0','B0','A1','B1','A2','B2']
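
If hard-coding the column list is undesirable, one possible alternative is to suffix each frame's columns with its position and concatenate side by side. This is only a sketch: the names `frames` and `wide` are made up here, and it assumes every frame has exactly the same set of unique IDs.

```python
import pandas as pd

df0 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 1, 'B': 2})
df1 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 3, 'B': 4})
df2 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 5, 'B': 6})

frames = [df0, df1, df2]

# Index each frame by ID and append its position to every column name,
# so 'A' becomes 'A0', 'A1', 'A2' and so on.
wide = pd.concat(
    [df.set_index('ID').add_suffix(str(i)) for i, df in enumerate(frames)],
    axis=1,
).reset_index()
print(wide.columns.tolist())
```

This yields the same column layout as the merge plus manual rename, and keeps working if more frames or more value columns are added.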

Define an example function f:

def f(a,b,c):
    return sum([a,b,c]) # for example

Create a dataframe for the result:

result=pd.DataFrame()
result['ID']=merged['ID']

Calculate the A and B columns of the result dataframe by applying the function f defined above to each row:

result['A']=merged.apply(lambda row: f(row['A0'],row['A1'],row['A2']),axis=1)
result['B']=merged.apply(lambda row: f(row['B0'],row['B1'],row['B2']),axis=1)

result will be:

     ID  A   B
0  john  9  12
1  paul  9  12
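
When f is built only from operations that pandas already applies elementwise (arithmetic, comparisons and the like), the merge and the row-wise apply may not be needed at all. The sketch below assumes such an f and assumes ID is unique within each frame; with ID as the index, pandas aligns the frames by ID and column name before combining them.

```python
import pandas as pd

df0 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 1, 'B': 2})
df1 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 3, 'B': 4})
df2 = pd.DataFrame({'ID': ['john', 'paul'], 'A': 5, 'B': 6})

def f(a, b, c):
    # Works on scalars and on whole DataFrames alike,
    # because + is applied entry by entry.
    return a + b + c

# Use ID as the index so entries are matched by ID, not by row position.
a, b, c = (df.set_index('ID') for df in (df0, df1, df2))
result = f(a, b, c).reset_index()
print(result)
```

For a function that cannot be written this way, such as one with arbitrary Python logic per entry, the row-wise apply shown above remains the straightforward option.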
