Merge columns with different numbers of rows based on the first two columns in Pandas

I have two files that contain the same number of columns but different numbers of rows, e.g.:

file1.txt

1650,A,1,1,1
1650,A,1,1,1
1650,A,1,1,1
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2

file2.txt

1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,B,4,4,4
1650,B,4,4,4

I want to combine them with pandas so that rows are paired in order within each pair of values in the first two columns, and the shorter side is padded with NaN. The result should look like this:

1650,A,1,1,1,3,3,3
1650,A,1,1,1,3,3,3
1650,A,1,1,1,3,3,3
1650,A,NaN,NaN,NaN,3,3,3
1650,A,NaN,NaN,NaN,3,3,3
1650,B,2,2,2,4,4,4
1650,B,2,2,2,4,4,4
1650,B,2,2,2,NaN,NaN,NaN
1650,B,2,2,2,NaN,NaN,NaN
1650,B,2,2,2,NaN,NaN,NaN

I tried the following code, but it does not produce this result:

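import pandas as pd

# read_data is the asker's own loader (not shown); presumably it names the first two columns 'a' and 'b'
df1 = read_data('file1')
df2 = read_data('file2')
result = pd.merge_ordered(df1, df2, how='outer', on=['a', 'b'])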

How can I solve this problem?

Use GroupBy.cumcount to number the rows within each (a, b) group, store that counter in a helper column group, and include it in the merge keys:

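# Number the rows within each (a, b) group: 0, 1, 2, ...
df1['group'] = df1.groupby(['a', 'b']).cumcount()
df2['group'] = df2.groupby(['a', 'b']).cumcount()
# Merge on the keys plus the counter, then drop the helper column
result = pd.merge(df1, df2, how='outer', on=['a', 'b', 'group']).drop('group', axis=1)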
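
For reference, here is a self-contained sketch that reproduces the sample data from the question and compares a key-only merge with the cumcount approach. The column names c1, d1, e1, c2, d2, e2 and the use of io.StringIO in place of the real files are assumptions for illustration; only 'a' and 'b' appear in the question. The explicit sort is a precaution so that the row order does not depend on how the outer merge orders its keys.

```python
import io

import pandas as pd

# Sample data copied from the question; StringIO stands in for the real files.
FILE1 = """1650,A,1,1,1
1650,A,1,1,1
1650,A,1,1,1
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
"""
FILE2 = """1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,B,4,4,4
1650,B,4,4,4
"""

# Value column names (c1, d1, ...) are hypothetical; the question only names 'a' and 'b'.
df1 = pd.read_csv(io.StringIO(FILE1), header=None, names=["a", "b", "c1", "d1", "e1"])
df2 = pd.read_csv(io.StringIO(FILE2), header=None, names=["a", "b", "c2", "d2", "e2"])

# Merging on the duplicated keys alone pairs every row of a group in df1
# with every row of the same group in df2: 3*5 rows for A plus 5*2 rows for B.
naive = pd.merge(df1, df2, how="outer", on=["a", "b"])
print(len(naive))  # 25

# Number the rows within each (a, b) group so that each key becomes unique.
df1["group"] = df1.groupby(["a", "b"]).cumcount()
df2["group"] = df2.groupby(["a", "b"]).cumcount()

result = (
    pd.merge(df1, df2, how="outer", on=["a", "b", "group"])
    .sort_values(["a", "b", "group"])  # make the row order explicit
    .drop(columns="group")
    .reset_index(drop=True)
)
print(len(result))  # 10
print(result)
```

The key-only merge yields 25 rows because the keys are not unique, while the merge that includes the counter yields the 10 rows shown in the question. Columns that receive NaN are converted to float, so the values print as 1.0, 2.0, and so on.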
