Preserving Column Order - Python Pandas and Column Concat
My google-fu isn't doing me justice on what seems like it should be a trivial procedure.
In Pandas for Python I have 2 datasets that I want to merge. This works fine using .concat. The issue is that .concat reorders my columns. From a data retrieval point of view, this is trivial. From an "I just want to open the file and quickly see the most important column" point of view, this is annoying.
File1.csv
Name Username Alias1
Tom Tomfoolery TJZ
Meryl MsMeryl Mer
Timmy Midsize Yoda
File2.csv
Name Username Alias1 Alias2
Bob Firedbob Fire Gingy
Tom Tomfoolery TJZ Awww
Result.csv
Alias1 Alias2 Name Username
0 TJZ NaN Tom Tomfoolery
1 Mer NaN Meryl MsMeryl
2 Yoda NaN Timmy Midsize
0 Fire Gingy Bob Firedbob
1 TJZ Awww Tom Tomfoolery
The result is fine, but in the data file I'm working with I have 1,000 columns. The 2-3 most important are now in the middle. Is there a way, in this toy example, I could have forced "Username" to be the first column and "Name" to be the second column, preserving the values below each all the way down?
Also, as a side note, when I save to file it also saves that numbering on the side (0 1 2 0 1). If there's a way to prevent that too, that'd be cool. If not, it's not a big deal since it's a quick fix to remove.
Thanks!
Assuming the concatenated DataFrame is df, you can reorder the columns as follows:
important = ['Username', 'Name']
reordered = important + [c for c in df.columns if c not in important]
df = df[reordered]
print(df)
Output:
Username Name Alias1 Alias2
0 Tomfoolery Tom TJZ NaN
1 MsMeryl Meryl Mer NaN
2 Midsize Timmy Yoda NaN
0 Firedbob Bob Fire Gingy
1 Tomfoolery Tom TJZ Awww
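For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the toy data from the question in memory and applies the same reordering. The frame names df1 and df2 are placeholders of mine, not from the original post, and the data is typed in by hand rather than read from File1.csv and File2.csv.

```python
import pandas as pd

# Toy frames mirroring File1.csv and File2.csv from the question
# (df1/df2 are illustrative names, not from the original post).
df1 = pd.DataFrame({
    "Name": ["Tom", "Meryl", "Timmy"],
    "Username": ["Tomfoolery", "MsMeryl", "Midsize"],
    "Alias1": ["TJZ", "Mer", "Yoda"],
})
df2 = pd.DataFrame({
    "Name": ["Bob", "Tom"],
    "Username": ["Firedbob", "Tomfoolery"],
    "Alias1": ["Fire", "TJZ"],
    "Alias2": ["Gingy", "Awww"],
})

# Stack the rows; columns missing from one frame are filled with NaN.
df = pd.concat([df1, df2])

# Put the important columns first, then every other column in its current order.
important = ["Username", "Name"]
reordered = important + [c for c in df.columns if c not in important]
df = df[reordered]

print(list(df.columns))
```

Because the list comprehension only filters df.columns, this scales to the 1,000-column case without typing out the remaining names. Newer pandas releases also take a sort argument on concat that controls whether the columns get sorted in the first place, so it may be worth checking the documentation for your version.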
The list of numbers [0, 1, 2, 0, 1] is the index of the DataFrame. To prevent it from being written to the output file, you can use the index=False option in to_csv():
df.to_csv('Result.csv', index=False, sep=' ')
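As a small sketch of what index=False changes, the snippet below writes a two-row frame to a string instead of a file (to_csv returns the text when no path is given). The two single-row frames are made up for illustration. It also passes ignore_index=True to concat, which is a separate option I'm adding here, not part of the answer above: it renumbers the rows 0, 1, ... so the index does not repeat even if you do keep it.

```python
import pandas as pd

# Two tiny illustrative frames (hypothetical data, same columns as the toy example).
df1 = pd.DataFrame({"Username": ["Tomfoolery"], "Name": ["Tom"]})
df2 = pd.DataFrame({"Username": ["Firedbob"], "Name": ["Bob"]})

# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 0.
df = pd.concat([df1, df2], ignore_index=True)

# With no path argument, to_csv returns the CSV text as a string.
# index=False leaves the row numbers out of the output.
text = df.to_csv(index=False, sep=" ")
print(text)
```

One thing to watch with sep=' ': missing values are written as empty fields by default, which are hard to see in a space-separated file, so a comma separator may be easier to read back.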