Sum column in one dataframe based on row value of another dataframe

Say I have one dataframe, df (also referred to as df1 below):

   a  b   c  d   e
0  1  2  dd  5  Col1
1  2  3  ee  9  Col2
2  3  4  ff  1  Col4

There's another dataframe, df2:

  Col1   Col2   Col3 
0  1      2       4      
1  2      3       5      
2  3      4       6  

I need to add a column Sum to the first dataframe. For each row, it should hold the sum of the values in the df2 column whose name appears in column e of df1.

Expected output:

   a  b   c  d   e     Sum
0  1  2  dd  5  Col1    6
1  2  3  ee  9  Col2    9
2  3  4  ff  1  Col4    0

The Sum value in the last row is 0 because Col4 doesn't exist in df2.

What I tried: writing some lambdas with the apply function, but I wasn't able to get it working. I'd greatly appreciate the help. Thank you.

Try:

df['Sum']=df.e.map(df2.sum()).fillna(0)
df
Out[89]: 
   a  b   c  d     e  Sum
0  1  2  dd  5  Col1  6.0
1  2  3  ee  9  Col2  9.0
2  3  4  ff  1  Col4  0.0
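
To make the one-liner above easier to follow, here is a step-by-step sketch of the same idea. The data is rebuilt from the question, the frame is named df1 rather than df, and the intermediate names col_sums and looked_up are only illustrative. The final astype(int) is an optional addition, assuming you want integer totals as in the expected output rather than the floats shown above.

```python
import pandas as pd

# Data from the question; df is called df1 here
df1 = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [2, 3, 4],
    "c": ["dd", "ee", "ff"],
    "d": [5, 9, 1],
    "e": ["Col1", "Col2", "Col4"],
})
df2 = pd.DataFrame({"Col1": [1, 2, 3], "Col2": [2, 3, 4], "Col3": [4, 5, 6]})

# Step 1: column totals of df2, as a Series indexed by column name
col_sums = df2.sum()

# Step 2: look up each value of df1["e"] in that Series;
# names with no matching column (Col4) become NaN
looked_up = df1["e"].map(col_sums)

# Step 3: replace NaN with 0, then cast back to int
# (the NaN forced the intermediate result to float)
df1["Sum"] = looked_up.fillna(0).astype(int)
```

The float values in the original output (6.0, 9.0, 0.0) come from the NaN produced in step 2, which is why the cast is needed if integers are wanted.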

Try this. The following solution uses the apply method to sum all values of the named column if it is present in df2, and returns 0 if no such column exists in df2.

df1["Sum"] = df1["e"].apply(lambda x: df2[x].sum() if x in df2.columns else 0)
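
One possible refinement, offered as a sketch rather than as part of the answer above: the lambda recomputes a column sum for every row of df1. Assuming df2 does not change in between, the totals can be computed once and looked up with Series.get, which returns a default for missing labels. The names col_sums and name are illustrative, and only column e of df1 is reproduced here.

```python
import pandas as pd

# Only the column needed for the lookup is reproduced from the question
df1 = pd.DataFrame({"e": ["Col1", "Col2", "Col4"]})
df2 = pd.DataFrame({"Col1": [1, 2, 3], "Col2": [2, 3, 4], "Col3": [4, 5, 6]})

# Compute every column total once, outside the lambda
col_sums = df2.sum()

# Series.get returns the default (0 here) when the label is missing
df1["Sum"] = df1["e"].apply(lambda name: col_sums.get(name, 0))
```

For three rows the difference is negligible; it may matter more when df1 is long and many rows point at the same df2 column.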

Use .iterrows() to iterate through a dataframe, pulling out the index and the values for each row.

A nested for-loop style of iteration can be used to grab the needed values from the second dataframe and apply them to the first:

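import pandas as pd

df1 = pd.DataFrame(data={'a': [1,2,3], 'b': [2,3,4], 'c': ['dd', 'ee', 'ff'], 'd': [5,9,1], 'e': ['Col1','Col2','Col3']})
df2 = pd.DataFrame(data={'Col1': [1,2,3], 'Col2': [2,3,4], 'Col3': [4,5,6]})
df1['Sum'] = 0  # placeholder column, filled in row by row below

for index, row in df1.iterrows():
  total = 0  # named total to avoid shadowing the built-in sum()
  # Only sum when the column named in 'e' exists in df2; otherwise keep 0
  if row['e'] in df2.columns:
    for index2, row2 in df2.iterrows():
      total += row2[row['e']]

  # .loc avoids the chained assignment df1['Sum'][index] = ...
  df1.loc[index, 'Sum'] = total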

Output (this example uses Col3 in column e, not Col4 as in the question):

    a   b   c   d   e       Sum
0   1   2   dd  5   Col1    6
1   2   3   ee  9   Col2    9
2   3   4   ff  1   Col3    15
