Multiplying columns in separate pandas DataFrames based on matching column values

Say I have two DataFrames:

import pandas as pd

df1 = pd.DataFrame({'alpha': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                    'number': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
  alpha  number
0     A       1
1     A       2
2     A       3
3     B       4
4     B       5
5     B       6
6     C       7
7     C       8
8     C       9

df2 = pd.DataFrame({'alpha': ['A', 'B', 'C'],
                    'mult': [2, 3, 4]})    
  alpha  mult
0     A     2
1     B     3
2     C     4

I want to create a third DataFrame that multiplies each value in df1 by the corresponding 'mult' value in df2, matched on the alpha value. The solution would look like this:

  alpha  soln
0     A     2
1     A     4
2     A     6
3     B    12
4     B    15
5     B    18
6     C    28
7     C    32
8     C    36

Any tips on how to do this easily?

The first thing I can think of is to merge the two DataFrames and then do the multiplication on the merged DataFrame:

tmp = df1.merge(df2)

tmp
#   alpha  number  mult
# 0     A       1     2
# 1     A       2     2
# 2     A       3     2
# 3     B       4     3
# 4     B       5     3
# 5     B       6     3
# 6     C       7     4
# 7     C       8     4
# 8     C       9     4

# Use bracket assignment: df1.soln = ... would set a plain attribute, not a column
df1['soln'] = tmp.number * tmp.mult

This works, though I feel there should be a simpler, one-step way too.

EDIT - here is a way to do this in one line:

# .values drops the index, so this relies on the product keeping df1's row order
df1['soln'] = (df1.set_index("alpha").number * df2.set_index("alpha").mult).values

EDIT2 - here's another one-liner, similar to @scott-boston's comment:

df1['soln'] = df1.merge(df2).assign(soln=lambda df: df.number * df.mult).soln
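
The merge-based approach above can be sketched end to end as follows. This is a minimal sketch, not the answerer's exact code: the name df3, the explicit on='alpha', the left join, and the validate='many_to_one' check are assumptions added here so that the result is a separate third DataFrame and a duplicated key in df2 fails loudly instead of silently multiplying rows.

```python
import pandas as pd

df1 = pd.DataFrame({'alpha': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                    'number': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
df2 = pd.DataFrame({'alpha': ['A', 'B', 'C'],
                    'mult': [2, 3, 4]})

# A left join keeps every row of df1; validate='many_to_one' raises
# pandas.errors.MergeError if an 'alpha' value appears more than once in df2.
merged = df1.merge(df2, on='alpha', how='left', validate='many_to_one')

# df3 is a name chosen for this sketch: the "third DataFrame" from the question.
df3 = merged.assign(soln=merged['number'] * merged['mult'])[['alpha', 'soln']]
print(df3)
```

Building df3 from the merged frame, rather than assigning back into df1, avoids depending on the two frames sharing the same index.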

map + multiply

Your join is on a single column, and the key is unique in df2, so you can use map.

df1['soln'] = df1.number.mul(df1.alpha.map(df2.set_index('alpha').mult))

#  alpha  number  soln
#0     A       1     2
#1     A       2     4
#2     A       3     6
#3     B       4    12
#4     B       5    15
#5     B       6    18
#6     C       7    28
#7     C       8    32
#8     C       9    36
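
As a self-contained variant of the map approach, the sketch below adds two guards that are not part of the original answer and are assumptions of this example: it checks that the key really is unique in df2, and it stops if any alpha in df1 has no multiplier (map returns NaN for unmatched keys). The names lookup, factors, and df3 are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'alpha': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                    'number': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
df2 = pd.DataFrame({'alpha': ['A', 'B', 'C'],
                    'mult': [2, 3, 4]})

# Lookup Series: the index holds the alpha keys, the values are the multipliers.
lookup = df2.set_index('alpha')['mult']
if not lookup.index.is_unique:
    raise ValueError("duplicate 'alpha' keys in df2")

# map looks up each alpha of df1 in the lookup; unmatched keys become NaN.
factors = df1['alpha'].map(lookup)
if factors.isna().any():
    missing = df1.loc[factors.isna(), 'alpha'].unique().tolist()
    raise KeyError(f"no multiplier in df2 for: {missing}")

# Row-wise product, collected into a separate third DataFrame.
df3 = pd.DataFrame({'alpha': df1['alpha'],
                    'soln': df1['number'].mul(factors)})
print(df3)
```

Because map works row by row on df1, the result keeps df1's index and row order without any sorting assumption.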
