Merge columns with different numbers of rows based on the first two columns in pandas
I have two different files containing the same number of columns but different numbers of rows, i.e.:
file1.txt
1650,A,1,1,1
1650,A,1,1,1
1650,A,1,1,1
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
file2.txt
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,B,4,4,4
1650,B,4,4,4
I want to combine them using pandas so that the result is as follows:
1650,A,1,1,1,3,3,3
1650,A,1,1,1,3,3,3
1650,A,1,1,1,3,3,3
1650,A,NaN,NaN,NaN,3,3,3
1650,A,NaN,NaN,NaN,3,3,3
1650,B,2,2,2,4,4,4
1650,B,2,2,2,4,4,4
1650,B,2,2,2,NaN,NaN,NaN
1650,B,2,2,2,NaN,NaN,NaN
1650,B,2,2,2,NaN,NaN,NaN
I use the following code, but it does not seem to work properly:
df1 = read_data('file1')
df2 = read_data('file2')
result = pd.merge_ordered(df1,df2, how='outer', on=['a', 'b'])
How can I solve this problem?
Use GroupBy.cumcount to build a counter within each (a, b) group and store it in a helper column named group. You can then merge on a, b and group, and drop the helper column afterwards:
df1['group'] = df1.groupby(['a', 'b']).cumcount()
df2['group'] = df2.groupby(['a', 'b']).cumcount()
result = pd.merge(df1,df2, how='outer', on=['a', 'b', 'group']).drop('group', axis=1)
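To make the effect of the counter concrete, here is a minimal, self-contained sketch using the sample data from the question. The column names a, b, c, d, e and the use of pd.read_csv on in-memory strings are assumptions made for illustration (the question's read_data helper is not shown). It also shows why merging on a and b alone is not enough: without the counter, every row of a group in one frame is paired with every row of the same group in the other.

```python
import io

import pandas as pd

# Sample data copied from the question; the files have no header row.
FILE1 = """1650,A,1,1,1
1650,A,1,1,1
1650,A,1,1,1
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
1650,B,2,2,2
"""
FILE2 = """1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,A,3,3,3
1650,B,4,4,4
1650,B,4,4,4
"""

# Assumed column names: the question only shows that the keys are 'a' and 'b'.
cols = ["a", "b", "c", "d", "e"]
df1 = pd.read_csv(io.StringIO(FILE1), header=None, names=cols)
df2 = pd.read_csv(io.StringIO(FILE2), header=None, names=cols)

# Merging on the keys alone pairs every row with every row of the same group
# (3*5 rows for A plus 5*2 rows for B).
naive = pd.merge(df1, df2, how="outer", on=["a", "b"])
print(len(naive))

# Number the rows inside each (a, b) group: 0, 1, 2, ...
df1["group"] = df1.groupby(["a", "b"]).cumcount()
df2["group"] = df2.groupby(["a", "b"]).cumcount()

# The counter makes each key unique, so rows are matched one-to-one;
# unmatched rows from either side are kept and padded with NaN.
result = (
    pd.merge(df1, df2, how="outer", on=["a", "b", "group"])
    .drop(columns="group")
)
print(result)
```

Because both frames share the column names c, d and e, pandas appends its default suffixes _x (left frame) and _y (right frame) to them in the result. Columns that receive NaN are converted to floating point.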