Preserving Column Order - Python Pandas and Column Concat

So my google-fu doesn't seem to be doing me justice with what seems like it should be a trivial procedure.

In Pandas for Python I have two datasets that I want to merge. This works fine using .concat. The issue is that .concat reorders my columns. From a data retrieval point of view, this is a trivial matter. From an "I just want to open the file and quickly see the most important column" point of view, it's annoying.

File1.csv
Name    Username     Alias1
Tom     Tomfoolery   TJZ
Meryl   MsMeryl      Mer
Timmy   Midsize      Yoda

File2.csv
Name    Username     Alias1   Alias2
Bob     Firedbob     Fire     Gingy
Tom     Tomfoolery   TJZ      Awww

Result.csv
    Alias1 Alias2   Name    Username
0   TJZ    NaN      Tom     Tomfoolery
1   Mer    NaN      Meryl   MsMeryl
2   Yoda   NaN      Timmy   Midsize
0   Fire   Gingy    Bob     Firedbob
1   TJZ    Awww     Tom     Tomfoolery

The result is fine, but the data file I'm working with has 1,000 columns, and the 2-3 most important ones are now in the middle. Is there a way, in this toy example, that I could have forced "Username" to be the first column and "Name" to be the second, while obviously preserving the values below each all the way down?

Also, as a side note, when I save to file it also saves that numbering on the side (0 1 2 0 1). If there's a way to prevent that too, that'd be cool. If not, it's not a big deal since it's a quick fix to remove.

Thanks!

Assuming the concatenated DataFrame is df, you can reorder the columns as follows:

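# Columns to show first, in the desired order.
important = ['Username', 'Name']
# Append every remaining column in its existing order.
reordered = important + [c for c in df.columns if c not in important]
df = df[reordered]
print(df)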

Output:

     Username   Name Alias1 Alias2
0  Tomfoolery    Tom    TJZ    NaN
1     MsMeryl  Meryl    Mer    NaN
2     Midsize  Timmy   Yoda    NaN
0    Firedbob    Bob   Fire  Gingy
1  Tomfoolery    Tom    TJZ   Awww
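
For anyone who wants to try this end to end, here is a minimal, self-contained sketch built from the toy data in the question. It makes a few assumptions that are not in the original posts: the two files are created inline rather than read from disk, `sort=True` is passed to `pd.concat` to get the alphabetical column order shown in Result.csv (available in pandas versions where `concat` accepts a `sort` argument), and `move_to_front` is a hypothetical helper name that simply wraps the list comprehension above.

```python
import pandas as pd

# Toy data mirroring File1.csv and File2.csv from the question.
file1 = pd.DataFrame({
    "Name": ["Tom", "Meryl", "Timmy"],
    "Username": ["Tomfoolery", "MsMeryl", "Midsize"],
    "Alias1": ["TJZ", "Mer", "Yoda"],
})
file2 = pd.DataFrame({
    "Name": ["Bob", "Tom"],
    "Username": ["Firedbob", "Tomfoolery"],
    "Alias1": ["Fire", "TJZ"],
    "Alias2": ["Gingy", "Awww"],
})

# Assumption: sort=True sorts the column labels, which reproduces the
# alphabetical column order seen in Result.csv.
df = pd.concat([file1, file2], sort=True)


def move_to_front(frame, important):
    """Hypothetical helper: return `frame` with `important` columns first."""
    rest = [c for c in frame.columns if c not in important]
    return frame[important + rest]


df = move_to_front(df, ["Username", "Name"])
print(df)
```

Because the remaining columns are appended in whatever order they already have, the same approach should carry over to a frame with 1,000 columns without listing them all.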

The list of numbers [0, 1, 2, 0, 1] is the index of the DataFrame. To prevent it from being written to the output file, you can use the index=False option of to_csv():

df.to_csv('Result.csv', index=False, sep=' ')
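
As a related sketch, and not part of the original answer: if the repeated row labels are unwanted in the DataFrame itself and not just in the file, `pd.concat` also accepts `ignore_index=True`, which renumbers the rows from 0. The two small frames below are made up for illustration, and `to_csv` is called without a path so that it returns the CSV text instead of writing Result.csv.

```python
import pandas as pd

# Illustrative frames only; the column names follow the toy example.
top = pd.DataFrame({"Username": ["Tomfoolery"], "Name": ["Tom"]})
bottom = pd.DataFrame({"Username": ["Firedbob"], "Name": ["Bob"]})

# ignore_index=True discards the original row labels and renumbers from 0.
df = pd.concat([top, bottom], ignore_index=True)

# With no path argument, to_csv returns the CSV text as a string.
# index=False leaves the row labels out of the output.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Either option on its own keeps the 0 1 2 0 1 column out of the saved file; `index=False` is the one that matters when writing.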
