Can I combine values of two rows if labels are not identical in pandas
Here are the two dataframes I want to combine. The campaign labels differ between them.
df1
Date Campaign Sales
11/07/2020 AMZ CT BR Leather Shoes ABCDEFG1234 $10
11/07/2020 AMZ CT NB Leather Shoes ABCDEFG1234 $20
11/07/2020 AMZ OG BR Bag HGIJK567 $30
11/07/2020 AMZ OG NB Bag HGIJK567 Desktop $40
df2
Date Campaign Spend
11/07/2020 GA BR Leather Shoes ABCDEFG1234 $5
11/07/2020 GA NB Leather Shoes ABCDEFG1234 $6
11/07/2020 GA BR Bag HGIJK567 $7
11/07/2020 GA NB Bag HGIJK567 Desktop $8
Here's the output I want:
df3
Date Campaign Spend Sales
11/07/2020 CT BR Leather Shoes ABCDEFG1234 $5 $10
11/07/2020 CT NB Leather Shoes ABCDEFG1234 $6 $20
11/07/2020 OG BR Bag HGIJK567 $7 $30
11/07/2020 OG NB Bag HGIJK567 Desktop $8 $40
I would create an extra column to perform the merge on. From what I can see, the rows should be matched on the campaign name without its leading acronyms.
# Build the join key by dropping the source-specific prefix:
# two tokens in df1 ("AMZ CT" / "AMZ OG"), one token in df2 ("GA").
df1['Campaign_j'] = df1['Campaign'].map(lambda x: ' '.join(x.split()[2:]))
df2['Campaign_j'] = df2['Campaign'].map(lambda x: ' '.join(x.split()[1:]))
print(df1)
print(df2)
# Left-join on the key, then keep only the columns of interest.
df3 = df1.merge(df2, how='left', on=['Campaign_j'], suffixes=('', '_x'))[['Campaign', 'Sales', 'Spend']]
The key has to keep the BR/NB token. If it is stripped as well, both Leather Shoes rows share one key, the merge pairs each of them with both rows of df2, and dropping duplicates afterwards no longer recovers the right pairs. With BR/NB kept, the key is unique in both dataframes, so each row of df1 matches exactly one row of df2 and nothing needs to be dropped. After the join, we select the desired columns. I have not added the date column because it has no effect in this problem. Output:
Campaign Sales Spend
0 AMZ CT BR Leather Shoes ABCDEFG1234 10 5
1 AMZ CT NB Leather Shoes ABCDEFG1234 20 6
2 AMZ OG BR Bag HGIJK567 30 7
3 AMZ OG NB Bag HGIJK567 Desktop 40 8
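For reference, here is a self-contained sketch of the helper-key approach run on the sample data from the question. It assumes the values are stored as strings such as "$10", that df1 labels always start with two source tokens and df2 labels with one, and that Date should be part of the join. The names strip_prefix and key are made up for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Date": ["11/07/2020"] * 4,
    "Campaign": [
        "AMZ CT BR Leather Shoes ABCDEFG1234",
        "AMZ CT NB Leather Shoes ABCDEFG1234",
        "AMZ OG BR Bag HGIJK567",
        "AMZ OG NB Bag HGIJK567 Desktop",
    ],
    "Sales": ["$10", "$20", "$30", "$40"],
})
df2 = pd.DataFrame({
    "Date": ["11/07/2020"] * 4,
    "Campaign": [
        "GA BR Leather Shoes ABCDEFG1234",
        "GA NB Leather Shoes ABCDEFG1234",
        "GA BR Bag HGIJK567",
        "GA NB Bag HGIJK567 Desktop",
    ],
    "Spend": ["$5", "$6", "$7", "$8"],
})


def strip_prefix(campaign: str, n_tokens: int) -> str:
    """Drop the first n_tokens whitespace-separated tokens (hypothetical helper)."""
    return " ".join(campaign.split()[n_tokens:])


# Assumption: df1 has two leading source tokens ("AMZ CT"), df2 has one ("GA").
left = df1.assign(key=df1["Campaign"].map(lambda s: strip_prefix(s, 2)))
right = df2.assign(key=df2["Campaign"].map(lambda s: strip_prefix(s, 1)))

# The desired output shows the label with only the first token ("AMZ") removed.
left["Campaign"] = left["Campaign"].map(lambda s: strip_prefix(s, 1))

df3 = left.merge(
    right[["Date", "key", "Spend"]],
    on=["Date", "key"],
    how="left",
    validate="one_to_one",  # raises MergeError if a key repeats on either side
)[["Date", "Campaign", "Spend", "Sales"]]

print(df3)
```

The validate="one_to_one" argument is optional. It is included here so that an ambiguous key fails loudly instead of silently multiplying rows.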
If I understand your question correctly: yes, you can. However, columns that exist in only one of the dataframes are left blank (NaN) in the rows that come from the other one.
Let me give you an example. Suppose you have two dataframes stored in First.csv and Second.csv, as follows:
First dataframe:
A, B, C
1, 2, 3
2, 3, 4
Second dataframe:
A, C
1, 3
2, 4
import pandas as pd
df_a = pd.read_csv('First.csv')
df_b = pd.read_csv('Second.csv')
You can use:
df_row_merged = pd.concat([df_a, df_b], ignore_index=True)
to merge the two dataframes. df_row_merged will be as follows:
A, B, C
1, 2.0, 3
2, 3.0, 4
1, , 3
2, , 4
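The same example can be tried without creating files on disk. This is a minimal sketch that assumes the two CSV contents shown above. It reads them from in-memory strings, and skipinitialspace=True is added so the spaces after the commas do not end up in the column names.

```python
import io

import pandas as pd

# In-memory stand-ins for First.csv and Second.csv (contents as in the example).
first_csv = "A, B, C\n1, 2, 3\n2, 3, 4\n"
second_csv = "A, C\n1, 3\n2, 4\n"

# skipinitialspace=True drops the blank that follows each comma.
df_a = pd.read_csv(io.StringIO(first_csv), skipinitialspace=True)
df_b = pd.read_csv(io.StringIO(second_csv), skipinitialspace=True)

# Stack the rows of both frames; column B is missing from df_b,
# so its rows get NaN there and B is upcast to float.
df_row_merged = pd.concat([df_a, df_b], ignore_index=True)
print(df_row_merged)
```

Note that pd.concat stacks rows on top of each other and does not pair rows up by label. To place values from both dataframes side by side in one row, as in the question, a merge on a shared key is the more fitting tool.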
I hope this helps.