Python pandas: Applying a function to a DataFrame, with arguments defined within the DataFrame
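I have a dataframe with the headers "Category", "Factor1", "Factor2", "Factor3", "Factor4", "UseFactorA", "UseFactorB".
The values of "UseFactorA" and "UseFactorB" are each one of the strings ['Factor1', 'Factor2', 'Factor3', 'Factor4'], keyed off the value in "Category".
I want to generate a column "Result" that equals dataframe[UseFactorA] / dataframe[UseFactorB].
Take the following dataframe as an example: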
Category  Factor1  Factor2  Factor3  Factor4  UseFactorA  UseFactorB
A         1        2        5        8        'Factor1'   'Factor3'
B         2        7        4        2        'Factor3'   'Factor1'
The "Result" series should be [0.2, 2] (1/5 for row A, 4/2 for row B).
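However, I can't figure out how to feed the values of UseFactorA and UseFactorB into the indexing to achieve this. If the columns to use were fixed, I would simply write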
df['Result'] = df['Factor1']/df['Factor2']
However, when I try
df['Results'] = df[df['UseFactorA']]/df[df['UseFactorB']]
I get the error
ValueError: Wrong number of items passed 3842, placement implies 1
Is there a way to do what I'm trying to do here?
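For what it's worth, the error appears to come from the fact that indexing a frame with a Series of column names selects whole columns, one per entry in the Series, so the result is a DataFrame rather than a single column. The sketch below rebuilds the two-row example from the question to show this; the 3842 in the error message is presumably the number of rows in the real data, which is an assumption and not something the question states.

```python
import pandas as pd

# The two-row sample from the question, rebuilt by hand.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

# A Series of names is treated as a list of column labels,
# so this picks one full column per row of the frame.
selected = df[df["UseFactorA"]]

print(type(selected).__name__)
print(selected.shape)
print(list(selected.columns))
```

Dividing two such frames gives another frame, which cannot be stored in the single column df['Results'].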
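This may not be the prettiest solution (because of the iteration), but what comes to mind is iterating over the factor columns and setting the "Result" value at each index: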
for i, factors in df[['UseFactorA', 'UseFactorB']].iterrows():
    # look up both factor values in row i only, so a single number is assigned
    df.loc[i, 'Result'] = df.loc[i, factors['UseFactorA']] / df.loc[i, factors['UseFactorB']]
Edit:
Another option:
def factor_calc_for_row(row):
    factorA = row['UseFactorA']
    factorB = row['UseFactorB']
    return row[factorA] / row[factorB]
df['Result'] = df.apply(factor_calc_for_row, axis=1)
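To try the row-wise function on the question's sample data, a self-contained sketch follows. It also cross-checks the result against a NumPy positional lookup, which is an alternative that does not appear in the answer; the names factor_cols and ResultVectorized are made up for illustration.

```python
import numpy as np
import pandas as pd

# The two-row sample from the question, rebuilt by hand.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

def factor_calc_for_row(row):
    # Each row names its own numerator and denominator columns.
    factor_a = row["UseFactorA"]
    factor_b = row["UseFactorB"]
    return row[factor_a] / row[factor_b]

df["Result"] = df.apply(factor_calc_for_row, axis=1)

# Alternative (an assumption, not from the answer): translate the column
# names into positions and pick one value per row with NumPy indexing.
factor_cols = ["Factor1", "Factor2", "Factor3", "Factor4"]
values = df[factor_cols].to_numpy(dtype=float)
rows = np.arange(len(df))
idx_a = pd.Index(factor_cols).get_indexer(df["UseFactorA"])
idx_b = pd.Index(factor_cols).get_indexer(df["UseFactorB"])
df["ResultVectorized"] = values[rows, idx_a] / values[rows, idx_b]

print(df[["Category", "Result", "ResultVectorized"]])
```

The apply version is easier to read; the positional lookup avoids calling a Python function per row, which may matter on a frame with thousands of rows.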
Here is a one-liner:
df['Results'] = [df[df['UseFactorA'][x]][x]/df[df['UseFactorB'][x]][x] for x in range(len(df))]
How it works:
df[df['UseFactorA']]
returns a DataFrame,
df[df['UseFactorA'][x]]
returns a Series, and
df[df['UseFactorA'][x]][x]
extracts a single value from that Series.
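The three steps can be checked on the question's sample data with a short sketch like the one below. The names step1, step2 and step3 are made up for illustration. Note that the [x] lookups go by label, so this relies on the frame having the default 0..n-1 index.

```python
import pandas as pd

# The two-row sample from the question, rebuilt by hand.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

x = 0  # first row

step1 = df[df["UseFactorA"]]        # all named columns at once: a DataFrame
step2 = df[df["UseFactorA"][x]]     # the column named in row x: a Series
step3 = df[df["UseFactorA"][x]][x]  # that column's value in row x: a scalar

print(type(step1).__name__)
print(type(step2).__name__)
print(step3)
```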