繁体   English   中英

在 Python Pandas 中合并两个数据集

[英]Merge two data-sets in Python Pandas

我有以下格式的两个数据集,并希望将它们合并为一个基于 City+Age+Gender 的数据集。 提前致谢

数据集 1:

        City    Age  Gender            Source         Count
0  California  15-24  Female  Amazon Prime Video       14629
1  California  15-24  Female             Fubo TV        3840
2  California  15-24  Female                Hulu       54067
3  California  15-24  Female             Netflix       11713
4  California  15-24  Female            Sling TV       10642

数据集2:

         City    Age  Gender           Source     Feeds
0  California  15-24  Female             Blogs    150
1  California  15-24  Female        Customsite     57
2  California  15-24  Female       Discussions     28
3  California  15-24  Female  Facebook Comment    555
4  California  15-24  Female           Google+     19

预期结果数据集:

    City      Age   Gender            Source          Count
  California  15-24  Female  Amazon Prime Video       14629
  California  15-24  Female             Fubo TV        3840
  California  15-24  Female                Hulu       54067
  California  15-24  Female             Netflix       11713
  California  15-24  Female            Sling TV       10642
  California  15-24  Female             Blogs          150
  California  15-24  Female        Customsite           57
  California  15-24  Female       Discussions           28
  California  15-24  Female  Facebook Comment          555
  California  15-24  Female           Google+           19

注意:Feeds/Count 表示相同的含义。 所以可以将它们中的任何一个作为合并数据集中的列名。

使用带有rename列的pandas.concat来对齐列 - 在both DataFrames需要相同的列:

df = pd.concat([df1, df2.rename(columns={'Feeds':'Count'})], ignore_index=True)
print (df)
         City    Age  Gender              Source  Count
0  California  15-24  Female  Amazon Prime Video  14629
1  California  15-24  Female             Fubo TV   3840
2  California  15-24  Female                Hulu  54067
3  California  15-24  Female             Netflix  11713
4  California  15-24  Female            Sling TV  10642
5  California  15-24  Female               Blogs    150
6  California  15-24  Female          Customsite     57
7  California  15-24  Female         Discussions     28
8  California  15-24  Female    Facebook Comment    555
9  California  15-24  Female             Google+     19

替代DataFrame.append - 不是纯python append

df = df1.append(df2.rename(columns={'Feeds':'Count'}), ignore_index=True)
print (df)
         City    Age  Gender              Source  Count
0  California  15-24  Female  Amazon Prime Video  14629
1  California  15-24  Female             Fubo TV   3840
2  California  15-24  Female                Hulu  54067
3  California  15-24  Female             Netflix  11713
4  California  15-24  Female            Sling TV  10642
5  California  15-24  Female               Blogs    150
6  California  15-24  Female          Customsite     57
7  California  15-24  Female         Discussions     28
8  California  15-24  Female    Facebook Comment    555
9  California  15-24  Female             Google+     19

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM