Using sort_values() with 'key' to sort a DataFrame column of tuples

Question

I have the following df:

df = pd.DataFrame({'Params': {0: (400, 30), 1: (2000, 10), 2: (1200, 10), 3: (2000, 30), 4: (1600, None)}, 'mean_test_score': {0: -0.6197478578718253, 1: -0.6164605619489576, 2: -0.6229674626212879, 3: -0.7963084775995496, 4: -0.7854265341671137}})

I wish to sort it according to the first element of the tuples in the first column.

First column of the desired output: {'Params': {0: (400, 30), 2: (1200, 10), 4: (1600, None), 1: (2000, 10), 3: (2000, 30)}}

I tried df.sort_values(by=('Params'), key=lambda x:x[0]), as I would with a list and .sort(), but I get the following error: ValueError: User-provided key function must not change the shape of the array.

I have looked at the documentation of sort_values(), but it did not help much in explaining why the lambda does not work.

EDIT: Following @DeepSpace's input, using df.sort_values(by='Params') gives '<' not supported between instances of 'NoneType' and 'int', due to the presence of None in the second element of one of the tuples.

Could anyone help?

Answer 1

The documentation of sort_values() says:

key should expect a Series and return a Series with the same shape as the input.

In df.sort_values(by=('Params'), key=lambda x:x[0]), x is actually the whole Params column, not a single tuple. Accessing x[0] therefore returns the first element of the x Series (one tuple), which does not have the same shape as the input Series. That is what raises the error.
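To make the difference concrete, here is a small sketch (the variable names col, first_row and firsts are only illustrative) comparing what x[0] returns on the column with what an element-wise map returns:

```python
import pandas as pd

df = pd.DataFrame({
    "Params": [(400, 30), (2000, 10), (1200, 10), (2000, 30), (1600, None)],
})

col = df["Params"]  # this is what sort_values() hands to `key`

# Indexing the Series with 0 looks up the row labelled 0: a single tuple.
first_row = col[0]

# Mapping over the Series extracts the first element of every tuple,
# so the result has one entry per row, just like the input.
firsts = col.map(lambda t: t[0])

print(first_row)                  # one tuple, not a Series
print(firsts.shape == col.shape)  # same shape as the input column
```

Only the second form satisfies the "same shape as the input" requirement quoted above.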

If you want to sort by the first element of each tuple, you can do:

df.sort_values(by='Params', key=lambda col: col.map(lambda x: x[0]))
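As a quick check, the following sketch applies that call to the df from the question. The kind="mergesort" argument is my addition, not part of the answer: it is there on the assumption that you want rows with equal first elements (the two 2000 entries) to keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame({
    "Params": {0: (400, 30), 1: (2000, 10), 2: (1200, 10),
               3: (2000, 30), 4: (1600, None)},
    "mean_test_score": {0: -0.6197478578718253, 1: -0.6164605619489576,
                        2: -0.6229674626212879, 3: -0.7963084775995496,
                        4: -0.7854265341671137},
})

# The key receives the whole column and returns a same-length Series
# holding only the first element of each tuple.
sorted_df = df.sort_values(
    by="Params",
    key=lambda col: col.map(lambda t: t[0]),
    kind="mergesort",  # stable sort: ties keep their original order
)

order = sorted_df.index.tolist()
print(order)
print(sorted_df)
```

Because only the first elements are compared, the None in (1600, None) is never compared with an int, so the '<' not supported error from the EDIT does not occur here.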
