
pandas: how to use the kind option in sort_values

Hi, I want to sort a DataFrame by the values in one column. The values are strings combined with numbers, and I want to sort by the numeric part. I found a helper that splits the numbers out of each string and tried to pass it to the kind option of sort_values, but it didn't work. Without the kind option, the result is sorted as 'D1 D10 D11 D2 D3...'. I want 'D1 D2 D3 D4 ... D10 D11'. Can you help me?

 # I want to sort as D1 D2 D3 D4 D5 D10 D11...
 df[Xlabel] = ['D1','D2','D3','D4','D5','D10','D11']

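 import re

 def atoi(text):
  # Convert digit-only chunks to int, leave other chunks as text
  return int(text) if text.isdigit() else text
 def natural_keys(text):
  # Split a string such as 'D10' into ['D', 10, ''] so numbers compare numerically
  return [ atoi(c) for c in re.split(r'(\d+)',text) ]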

 # My attempt, which fails with the error message shown below
 df.sort_values(by=[Xlabel], inplace=True, kind=natural_keys[list(df[Xlabel])])

 # This runs, but the order is not what I want:
 # it sorts as D1 D10 D11 D2 D3...
 df.sort_values(by=[Xlabel], inplace=True)
# Error message from my attempt
df.sort_values(by=[Xlabel], inplace=True, kind=natural_keys[list(df[Xlabel])])
TypeError: 'function' object is not subscriptable

I think it is better here to use natsort and convert the column to an ordered categorical:

import pandas as pd
import natsort as ns

df = pd.DataFrame({'Xlabel':['D1','D2','D3','D4','D5','D10','D11']})

# Use the naturally sorted unique labels as the category order
df['Xlabel'] = pd.Categorical(df['Xlabel'],
                              ordered=True,
                              categories=ns.natsorted(df['Xlabel'].unique()))
df = df.sort_values('Xlabel')
print(df)
  Xlabel
0     D1
1     D2
2     D3
3     D4
4     D5
5    D10
6    D11
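If installing natsort is not an option, the same ordered-categorical idea can be sketched with the natural_keys helper from the question and the built-in sorted. The sample labels below are made up for illustration, and the final reset_index call is an addition, not part of the answer above:

```python
import re

import pandas as pd


def natural_keys(text):
    # Split 'D10' into ['D', 10, ''] so the numeric part compares as an int.
    return [int(c) if c.isdigit() else c for c in re.split(r"(\d+)", text)]


# Deliberately shuffled sample data (an assumption for this sketch).
df = pd.DataFrame({"Xlabel": ["D10", "D2", "D11", "D1", "D3"]})

# Build the category order with plain Python instead of natsort.
categories = sorted(df["Xlabel"].unique(), key=natural_keys)
df["Xlabel"] = pd.Categorical(df["Xlabel"], categories=categories, ordered=True)

df = df.sort_values("Xlabel").reset_index(drop=True)
print(df["Xlabel"].tolist())
```

Because the column is now an ordered categorical, later sorts and comparisons on it keep following the natural order.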

Also, I think in newer versions of pandas this should be possible with the new key parameter of sort_values.

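A function is called with parentheses, not square brackets, which is what raises the TypeError. Note, though, that kind only selects the sorting algorithm ('quicksort', 'mergesort', 'heapsort' or 'stable'), so passing sort keys to it will not give a natural order even once the call syntax is corrected: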

df.sort_values(by=[Xlabel], inplace=True, kind=natural_keys(list(df[Xlabel])))
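As a quick check outside pandas, the natural_keys helper from the question can be passed to Python's built-in sorted, which does accept a key function. This is only a sketch with made-up sample labels; it also shows that the helper expects a single string rather than a list:

```python
import re


def atoi(text):
    # Digit-only chunks become ints; everything else stays a string.
    return int(text) if text.isdigit() else text


def natural_keys(text):
    # The capturing group keeps the digits in the split result.
    return [atoi(c) for c in re.split(r"(\d+)", text)]


labels = ["D10", "D2", "D11", "D1", "D3"]

# The helper is applied to one string at a time.
print(natural_keys("D10"))

# sorted() calls the key function once per element.
ordered = sorted(labels, key=natural_keys)
print(ordered)
```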

Update for pandas 1.1.0: sort_values now has a key parameter:

df.sort_values('Xlabel', key=lambda x: x.str.extract(r'(\d+)', expand=False).astype(int))

Output:

  Xlabel
0     D1
1     D2
2     D3
3     D4
4     D5
5    D10
6    D11
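The key above looks only at the digits, so labels with different letter prefixes would be interleaved. A minimal sketch for that case, assuming every label is a non-digit prefix followed by digits (the sample data and the prefix/num column names are hypothetical), sorts on two helper columns instead:

```python
import pandas as pd

# Sample labels with two different prefixes (an assumption for this sketch).
df = pd.DataFrame({"Xlabel": ["D10", "D2", "C3", "D1", "C12"]})

# Split each label into its text prefix and numeric suffix via named groups.
parts = df["Xlabel"].str.extract(r"^(?P<prefix>\D*)(?P<num>\d+)$")
parts["num"] = parts["num"].astype(int)

# Sort by prefix first, then by the integer value, and reuse that row order.
order = parts.sort_values(["prefix", "num"]).index
result = df.loc[order].reset_index(drop=True)
print(result["Xlabel"].tolist())
```

Labels that do not match the pattern would yield missing values in the helper columns and make the integer conversion fail, so this sketch suits only uniformly formatted labels.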

