Preserving column order in the pandas to_csv method
The to_csv method of pandas does not preserve the order of columns; it writes the columns to the CSV in alphabetical order. This was reported as a bug and was supposed to be fixed in version 0.11.0, but I am on 0.18.0 and still see it.
import pandas as pd
# a, b, c stand for the actual column values
df = pd.DataFrame({'V_pod_error' : [a],
                   'V_pod_used' : [b],
                   'U_sol_type' : [c],
                   # ... and so on, up to 50 columns
                   })
df.to_csv(file_name)  # file_name: path of the output CSV
Excel order:
0 U_sol_type V_pod_error V_pod_used ...
1
What I want is the order used in the dictionary:
0 V_pod_error V_pod_used U_sol_type ...
1
I have a huge number of columns and names, so I cannot reorder them manually or write out the column order. The exact same question was asked here in 2013, and it doesn't look like there has been an update. I would like to ask the community for help; this is really problematic.
Try the following solution. I faced the same issue and solved it as follows:
import pandas as pd
df = pd.DataFrame({'V_pod_error' : [a],
                   'V_pod_used' : [b],
                   'U_sol_type' : [c],
                   # ... and so on, up to 50 columns
                   })
# List the column names in the order they should appear in the CSV
column_order = ['V_pod_error', 'V_pod_used', 'U_sol_type']  # ... up to 50 column names
df[column_order].to_csv(file_name)
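As a runnable illustration of the same idea, here is a minimal sketch. The data values are made up, and the CSV is written to an in-memory buffer instead of a file so that the header line can be inspected directly.

```python
import io
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({'U_sol_type': [7, 8],
                   'V_pod_error': [0, 2],
                   'V_pod_used': [6, 4]})

# Desired column order for the CSV.
column_order = ['V_pod_error', 'V_pod_used', 'U_sol_type']

# Select the columns in that order, then write to an in-memory buffer.
buffer = io.StringIO()
df[column_order].to_csv(buffer, index=False)

# The first line of the CSV is the header row.
header = buffer.getvalue().splitlines()[0]
print(header)
```

Selecting with a list of names returns a new DataFrame whose columns follow the list, so the original df is left unchanged.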
I think the problem is in the DataFrame constructor: you need to pass the columns parameter to get a custom column order. If you don't set columns, the columns are sorted alphanumerically.
import pandas as pd
df = pd.DataFrame({'V_pod_error' : [0,2],
                   'V_pod_used' : [6,4],
                   'U_sol_type' : [7,8]})
print(df)
   U_sol_type  V_pod_error  V_pod_used
0           7            0           6
1           8            2           4
print(df.to_csv())
,U_sol_type,V_pod_error,V_pod_used
0,7,0,6
1,8,2,4
df1 = pd.DataFrame({'V_pod_error' : [0,2],
                    'V_pod_used' : [6,4],
                    'U_sol_type' : [7,8]},
                   columns=['V_pod_error','V_pod_used','U_sol_type'])
print(df1)
   V_pod_error  V_pod_used  U_sol_type
0            0           6           7
1            2           4           8
print(df1.to_csv())
,V_pod_error,V_pod_used,U_sol_type
0,0,6,7
1,2,4,8
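With around 50 columns, typing every name a second time for the columns parameter is tedious. One possible way around that, sketched here with made-up values and a hypothetical pairs list, is to keep the data as ordered (name, values) pairs and derive the column order from them.

```python
import pandas as pd

# Hypothetical (name, values) pairs, kept in the order the CSV should use.
pairs = [('V_pod_error', [0, 2]),
         ('V_pod_used', [6, 4]),
         ('U_sol_type', [7, 8])]

# Derive the column order from the pairs instead of typing it again.
column_order = [name for name, _ in pairs]

# Passing columns= fixes the order regardless of how the dict is ordered.
df = pd.DataFrame(dict(pairs), columns=column_order)

# With no path argument, to_csv returns the CSV as a string.
csv_text = df.to_csv()
header = csv_text.splitlines()[0]
print(header)
```

The leading comma in the header is the unnamed index column, which matches the to_csv output shown in this answer.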
EDIT:
Another solution is to set the column order by selecting a subset before writing with to_csv (thanks Mathias711):
import pandas as pd
df = pd.DataFrame({'V_pod_error' : [0,2],
                   'V_pod_used' : [6,4],
                   'U_sol_type' : [7,8]})
print(df)
   U_sol_type  V_pod_error  V_pod_used
0           7            0           6
1           8            2           4
df = df[['V_pod_error','V_pod_used','U_sol_type']]
print(df)
   V_pod_error  V_pod_used  U_sol_type
0            0           6           7
1            2           4           8
EDIT1: It may help to first convert the dict to an OrderedDict and then create the DataFrame:
import collections
import pandas as pd
d = {'V_pod_error' : [0,2],'V_pod_used' : [6,4], 'U_sol_type' : [7,8]}
print(d)
{'V_pod_error': [0, 2], 'V_pod_used': [6, 4], 'U_sol_type': [7, 8]}
print(pd.DataFrame(d))
   U_sol_type  V_pod_error  V_pod_used
0           7            0           6
1           8            2           4
# Build the OrderedDict from (key, value) pairs; converting a plain dict
# only keeps the written order on Python 3.7+
d1 = collections.OrderedDict([('V_pod_error', [0,2]), ('V_pod_used', [6,4]), ('U_sol_type', [7,8])])
print(d1)
OrderedDict([('V_pod_error', [0, 2]), ('V_pod_used', [6, 4]), ('U_sol_type', [7, 8])])
print(pd.DataFrame(d1))
   V_pod_error  V_pod_used  U_sol_type
0            0           6           7
1            2           4           8
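For readers on newer versions: the alphabetical sorting shown above applies to older setups such as pandas 0.18. Assuming Python 3.7 or later and pandas 0.23 or later, a plain dict keeps its insertion order and the DataFrame constructor follows it, so the OrderedDict step should no longer be needed. A minimal check, under those version assumptions:

```python
import pandas as pd

# Plain dict, written in the order the columns should appear.
d = {'V_pod_error': [0, 2], 'V_pod_used': [6, 4], 'U_sol_type': [7, 8]}

# Assumes Python 3.7+ and pandas 0.23+; older versions sort the keys instead.
df = pd.DataFrame(d)
columns = list(df.columns)
print(columns)

# to_csv writes the columns in the DataFrame's own order.
header = df.to_csv(index=False).splitlines()[0]
print(header)
```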
Try with:
df.to_csv(file_name, sep=',', encoding='utf-8', header=True, columns=["Col1","Col2","Col3","Col4"])
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
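A small self-contained sketch of the columns parameter of to_csv follows. The frame and its values are hypothetical, and an in-memory buffer stands in for file_name so the result can be checked without touching the disk.

```python
import io
import pandas as pd

# Hypothetical frame whose columns are deliberately out of order.
df = pd.DataFrame({'Col3': [3], 'Col1': [1], 'Col2': [2]})

# columns= selects which columns are written and in which order.
buffer = io.StringIO()
df.to_csv(buffer, sep=',', header=True, index=False,
          columns=['Col1', 'Col2', 'Col3'])

lines = buffer.getvalue().splitlines()
print(lines)
```

This reorders only the written output; df itself keeps its original column order.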
The question is very old, but I wish to provide my solution to the related problem of preserving the order of columns while reading a CSV file into a pandas DataFrame:
import numpy as np
import pandas as pd
# Column indices 0..n-1 as an array; hmprice is assumed to be a
# DataFrame already loaded from the same file
cols = np.arange(0, hmprice.shape[1])
df = pd.read_csv('train.csv', usecols=cols)
df.head()
[Screenshot: default order of the dataframe]
[Screenshot: preserved order of the dataframe]
NOTE: The usecols parameter accepts either column names or column indices, but pandas does not honor the order in which they are listed.
For example,
df = pd.read_csv('train.csv', usecols=[1, 2, 3])
or
df = pd.read_csv('train.csv', usecols=[3, 2, 1])
gives the same result.
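If a specific order is needed after reading, one option is to reindex the result with the same list that was passed to usecols. The sketch below uses a small made-up CSV held in a string in place of train.csv; the column names a to d and the wanted list are illustrative only.

```python
import io
import pandas as pd

# Made-up CSV content standing in for train.csv.
csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

# The order we would like, which differs from the file order.
wanted = ['d', 'c', 'b']

# usecols picks the columns but returns them in file order.
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
file_order = list(df.columns)
print(file_order)

# Selecting with the same list afterwards applies the wanted order.
df = df[wanted]
final_order = list(df.columns)
print(final_order)
```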