Merge two data-sets in Python Pandas
I have two datasets in the format below and want to merge them into a single dataset based on City+Age+Gender. Thanks in advance.
Dataset1:
City Age Gender Source Count
0 California 15-24 Female Amazon Prime Video 14629
1 California 15-24 Female Fubo TV 3840
2 California 15-24 Female Hulu 54067
3 California 15-24 Female Netflix 11713
4 California 15-24 Female Sling TV 10642
Dataset2:
City Age Gender Source Feeds
0 California 15-24 Female Blogs 150
1 California 15-24 Female Customsite 57
2 California 15-24 Female Discussions 28
3 California 15-24 Female Facebook Comment 555
4 California 15-24 Female Google+ 19
Expected resulting dataset:
City Age Gender Source Count
California 15-24 Female Amazon Prime Video 14629
California 15-24 Female Fubo TV 3840
California 15-24 Female Hulu 54067
California 15-24 Female Netflix 11713
California 15-24 Female Sling TV 10642
California 15-24 Female Blogs 150
California 15-24 Female Customsite 57
California 15-24 Female Discussions 28
California 15-24 Female Facebook Comment 555
California 15-24 Female Google+ 19
Note: Feeds and Count mean the same thing, so either is fine as the column name in the merged dataset.
Use pandas.concat together with rename to align the columns. Both DataFrames need the same column names, otherwise Count and Feeds end up as two separate columns padded with NaN:
df = pd.concat([df1, df2.rename(columns={'Feeds':'Count'})], ignore_index=True)
print (df)
City Age Gender Source Count
0 California 15-24 Female Amazon Prime Video 14629
1 California 15-24 Female Fubo TV 3840
2 California 15-24 Female Hulu 54067
3 California 15-24 Female Netflix 11713
4 California 15-24 Female Sling TV 10642
5 California 15-24 Female Blogs 150
6 California 15-24 Female Customsite 57
7 California 15-24 Female Discussions 28
8 California 15-24 Female Facebook Comment 555
9 California 15-24 Female Google+ 19
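For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frames df1 and df2 are rebuilt by hand from the sample rows in the question; that construction (and the helper dict named keys) is an assumption for illustration, not how the original data was loaded.

```python
import pandas as pd

# Shared key columns, copied from the question's sample rows (assumed for illustration).
keys = {"City": "California", "Age": "15-24", "Gender": "Female"}

# Hand-built copy of Dataset1; the scalar key values are broadcast to every row.
df1 = pd.DataFrame({
    **keys,
    "Source": ["Amazon Prime Video", "Fubo TV", "Hulu", "Netflix", "Sling TV"],
    "Count": [14629, 3840, 54067, 11713, 10642],
})

# Hand-built copy of Dataset2; note the value column is called Feeds here.
df2 = pd.DataFrame({
    **keys,
    "Source": ["Blogs", "Customsite", "Discussions", "Facebook Comment", "Google+"],
    "Feeds": [150, 57, 28, 555, 19],
})

# Rename Feeds to Count so the columns line up, then stack the rows
# and rebuild a clean 0..n-1 index.
df = pd.concat([df1, df2.rename(columns={"Feeds": "Count"})], ignore_index=True)
print(df)
```

Without the rename, concat would still run, but the result would have both a Count and a Feeds column, each half filled with NaN.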
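Alternative with DataFrame.append (the pandas method, not the pure Python list append). Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this only works on older versions; on current versions use pandas.concat instead: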
df = df1.append(df2.rename(columns={'Feeds':'Count'}), ignore_index=True)
print (df)
City Age Gender Source Count
0 California 15-24 Female Amazon Prime Video 14629
1 California 15-24 Female Fubo TV 3840
2 California 15-24 Female Hulu 54067
3 California 15-24 Female Netflix 11713
4 California 15-24 Female Sling TV 10642
5 California 15-24 Female Blogs 150
6 California 15-24 Female Customsite 57
7 California 15-24 Female Discussions 28
8 California 15-24 Female Facebook Comment 555
9 California 15-24 Female Google+ 19
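If the same script has to run on both old and new pandas versions, one possible approach is to check whether DataFrame.append still exists and fall back to pandas.concat otherwise. This is only a sketch: the two small frames and the names renamed and stacked are made up for illustration, and calling concat unconditionally is simpler if old versions do not matter.

```python
import pandas as pd

# Tiny stand-in frames (hypothetical, trimmed from the question's samples).
df1 = pd.DataFrame({"Source": ["Hulu", "Netflix"], "Count": [54067, 11713]})
df2 = pd.DataFrame({"Source": ["Blogs", "Google+"], "Feeds": [150, 19]})

# Align the value column name before stacking.
renamed = df2.rename(columns={"Feeds": "Count"})

if hasattr(pd.DataFrame, "append"):
    # Older pandas (before 2.0): DataFrame.append is still available.
    stacked = df1.append(renamed, ignore_index=True)
else:
    # pandas 2.0 and later: append is gone, concat gives the same result.
    stacked = pd.concat([df1, renamed], ignore_index=True)

print(stacked)
```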