Python: How to force pandas to sort columns in a dataset?
I have the following dataset:
import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})
I need to sort the columns, and to do so I use the following code:
df = df.reindex(sorted(df.columns), axis=1)
However, it fails with the following error:
TypeError: '<' not supported between instances of 'str' and 'int'
I know the error appears because one of the column names is a str while the others are int, but how can I solve this problem?
The result I want has the columns sorted as follows:
'Facility',2002,2003,2004
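You are almost there.
As you already mentioned, the column names are a mix of str and int, so they cannot be compared with each other and the sort fails. One option is to convert every column name to a string before sorting. Keep in mind that with all-string labels, sorted() places '2002', '2003' and '2004' before 'Facility', because digits sort before letters. Also, astype returns a new Index rather than modifying the columns in place, so the result has to be assigned back:
df.columns = df.columns.astype(str)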
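If the labels should keep their original types and "Facility" should come first, as in the question, a sort key can keep strings and integers from ever being compared directly. This is only a sketch: the helper name mixed_key is made up for illustration, and it assumes every label is either a str or an int.

```python
import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})

def mixed_key(label):
    # Hypothetical helper: strings go in group 0, everything else in group 1.
    # Tuples are compared element by element, so a str and an int are
    # separated by the group number and never compared with each other.
    return (0 if isinstance(label, str) else 1, label)

ordered = sorted(df.columns, key=mixed_key)
df_sorted = df.reindex(ordered, axis=1)
print(list(df_sorted.columns))
```

The column labels are left untouched here, so later lookups such as df_sorted[2002] still work with the integer label.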
Setup: ensure all column names that are supposed to be strings are str, and all that are supposed to be integers are int.
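One possible way to do that setup, sketched under assumptions: the frame raw and the helper normalize below are invented for illustration, and the rule "a digit-only string should become an int" is a guess at what the labels are meant to be.

```python
import pandas as pd

# Hypothetical frame whose year labels arrived as a mix of str and int.
raw = pd.DataFrame({"2002": [1], "Facility": [5], 2003: [2]})

def normalize(label):
    # Digit-only strings become int; every other label is left unchanged.
    if isinstance(label, str) and label.isdigit():
        return int(label)
    return label

raw.columns = [normalize(c) for c in raw.columns]
print(list(raw.columns))
```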
1) Get a list of the columns:
my_columns = list(df.columns)
2) Remove "Facility" from the list:
my_columns.remove("Facility")
3) Sort the list of integers:
my_columns.sort()
4) Insert "Facility" at the front of the list:
my_columns.insert(0, "Facility")
5) Reorder the DataFrame with the newly ordered my_columns:
df = df[my_columns]
6) If needed, change the column names back to all strings. astype returns a new Index, so assign it back:
df.columns = df.columns.astype(str)
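The steps above can be put together into one runnable sketch. It uses the DataFrame from the question and assumes "Facility" is the only non-integer column name.

```python
import pandas as pd

df = pd.DataFrame({2002: [None, None, 2, 4, 5],
                   "Facility": [5, 5, 6, 44, 2],
                   2003: [None, None, None, 1, 5],
                   2004: [4, 4, 3, 2, 6]})

my_columns = list(df.columns)         # 1) column labels as a plain list
my_columns.remove("Facility")         # 2) take out the only string label
my_columns.sort()                     # 3) the rest are ints, so sorting works
my_columns.insert(0, "Facility")      # 4) put "Facility" back at the front
df = df[my_columns]                   # 5) reorder the DataFrame
df.columns = df.columns.astype(str)   # 6) optional: make every label a str
print(list(df.columns))
```

After step 6 the year labels are strings, so a column is selected with df["2002"] rather than df[2002]. Skip that step if the integer labels should be kept.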