
Appending two dataframes with same columns, different order

I have two pandas dataframes:

import pandas as pd
from pandas import DataFrame
noclickDF = DataFrame([[0,123,321],[0,1543,432]], columns=['click', 'id','location'])
clickDF = DataFrame([[1,123,421],[1,1543,436]], columns=['click', 'location','id'])

I simply want to join them so that the final DF looks like this:

click  |  id   |   location
0         123        321
0         1543       432
1         421        123
1         436       1543

As you can see, the column names of both original DataFrames are the same, but not in the same order. Also, there is no column to join on.

You could also use pd.concat:

In [36]: pd.concat([noclickDF, clickDF], ignore_index=True)
Out[36]: 
   click    id  location
0      0   123       321
1      0  1543       432
2      1   421       123
3      1   436      1543

Under the hood, DataFrame.append calls pd.concat. DataFrame.append has code for handling various types of input, such as Series, tuples, lists and dicts. If you pass it a DataFrame, it passes straight through to pd.concat, so using pd.concat is a bit more direct.
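To make the column alignment concrete, here is a minimal, self-contained sketch that rebuilds the two frames from the question and concatenates them. It assumes pandas is imported as pd; the name combined is only for illustration.

```python
import pandas as pd

# Same frames as in the question; note the swapped column order in clickDF.
noclickDF = pd.DataFrame([[0, 123, 321], [0, 1543, 432]],
                         columns=['click', 'id', 'location'])
clickDF = pd.DataFrame([[1, 123, 421], [1, 1543, 436]],
                       columns=['click', 'location', 'id'])

# concat matches columns by label, not by position, so 421 and 436
# end up under 'id' even though they are the third value in each row.
combined = pd.concat([noclickDF, clickDF], ignore_index=True)

print(combined)
```

Because ignore_index=True is passed, the rows of the result are relabelled 0 through 3 instead of keeping each frame's own 0 and 1.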

For future users (pandas 0.23.0 or later):

You may also need to add sort=True to sort the non-concatenation axis when it is not already aligned (i.e. to retain the OP's desired concatenation behavior). I used the code contributed above and got a warning; see Python Pandas User Warning. The code below works and does not throw a warning.

In [36]: pd.concat([noclickDF, clickDF], ignore_index=True, sort=True)
Out[36]: 
   click    id  location
0      0   123       321
1      0  1543       432
2      1   421       123
3      1   436      1543
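With the question's frames, sort=True and sort=False give the same column order, because the first frame's columns are already alphabetical. The sketch below uses two hypothetical one-row frames (first and second are not from the question) to show where the two settings should differ. It assumes a pandas version that accepts the sort argument (0.23.0 or later).

```python
import pandas as pd

# Hypothetical frames for illustration: the first one lists its
# columns in non-alphabetical order.
first = pd.DataFrame([[321, 0, 123]], columns=['location', 'click', 'id'])
second = pd.DataFrame([[1, 421, 123]], columns=['click', 'id', 'location'])

# sort=False keeps the column order of the first frame.
unsorted_cols = pd.concat([first, second], ignore_index=True, sort=False)

# sort=True orders the column labels alphabetically.
sorted_cols = pd.concat([first, second], ignore_index=True, sort=True)

print(list(unsorted_cols.columns))
print(list(sorted_cols.columns))
```

In both cases the values are still matched by column label; sort only changes the order in which the columns appear.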

You can use append for that:

 df = noclickDF.append(clickDF)
 print(df)

    click    id  location
 0      0   123       321  
 1      0  1543       432
 0      1   421       123
 1      1   436      1543

If you need to, you can reset the index with:

df = df.reset_index(drop=True)  # reset_index returns a new DataFrame
print(df)
   click    id  location
0      0   123       321
1      0  1543       432
2      1   421       123
3      1   436      1543
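DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions the same two steps can be written with pd.concat. This is a sketch of an equivalent, not the answerer's original code; the name stacked is only for illustration.

```python
import pandas as pd

noclickDF = pd.DataFrame([[0, 123, 321], [0, 1543, 432]],
                         columns=['click', 'id', 'location'])
clickDF = pd.DataFrame([[1, 123, 421], [1, 1543, 436]],
                       columns=['click', 'location', 'id'])

# Without ignore_index, each frame keeps its own row labels,
# mirroring the duplicated 0, 1, 0, 1 index shown for append.
stacked = pd.concat([noclickDF, clickDF])
print(list(stacked.index))

# reset_index returns a new DataFrame rather than modifying in place,
# so the result has to be assigned.
df = stacked.reset_index(drop=True)
print(df)
```

Passing ignore_index=True to pd.concat does both steps at once if the original row labels are not needed.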
