sort_values() with 'key' to sort a column of tuples in a dataframe
I have the following df:
df = pd.DataFrame({'Params': {0: (400, 30), 1: (2000, 10), 2: (1200, 10), 3: (2000, 30), 4: (1600, None)}, 'mean_test_score': {0: -0.6197478578718253, 1: -0.6164605619489576, 2: -0.6229674626212879, 3: -0.7963084775995496, 4: -0.7854265341671137}})
I wish to sort it according to the first element of the tuples in the first column.
First column of the desired output: {'Params': {0: (400, 30), 2: (1200, 10), 4: (1600, 10), 1: (2000, 10), 3: (2000, 30),
I have tried df.sort_values(by=('Params'), key=lambda x:x[0]), as I would with a list and .sort, but I get the following error: ValueError: User-provided key function must not change the shape of the array.
I have looked at the documentation of sort_values(), but it did not explain why the lambda does not work.
EDIT: Following @DeepSpace's input, df.sort_values(by='Params') gives '<' not supported between instances of 'NoneType' and 'int', due to the presence of None in the second element of the tuples.
Could anyone help?
The documentation of sort_values() says that key should expect a Series and return a Series with the same shape as the input.
In df.sort_values(by=('Params'), key=lambda x:x[0]), x is actually the whole Params column, not a single tuple. By accessing x[0], you return the first element of the Series x (one tuple), which does not have the same shape as the input Series. That is what produces the error.
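To make the shape mismatch concrete, here is a minimal sketch that calls the same kind of key function by hand on the column. The names params and bad_key are only illustrative, and the Series is rebuilt from the tuples in the question.

```python
import pandas as pd

# The 'Params' column from the question, rebuilt as a standalone Series.
params = pd.Series(
    [(400, 30), (2000, 10), (1200, 10), (2000, 30), (1600, None)],
    name="Params",
)

def bad_key(x):
    # sort_values() hands the key function the whole column at once,
    # so x[0] looks up label 0 and returns a single tuple.
    return x[0]

result = bad_key(params)

print(type(params).__name__, len(params))  # Series 5
print(result, len(result))                 # (400, 30) 2
```

The key function returned a 2-element tuple for a 5-element column, so the lengths no longer match, which is the situation the ValueError describes.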
If you want to sort by the first element of each tuple, you can do
df.sort_values(by='Params', key=lambda col: col.map(lambda x: x[0]))
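For reference, a self-contained version of the line above, run against the df from the question. One assumption is added that is not in the original answer: kind="mergesort" is passed so that rows with the same first element (the two 2000 entries) keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame({
    "Params": {0: (400, 30), 1: (2000, 10), 2: (1200, 10),
               3: (2000, 30), 4: (1600, None)},
    "mean_test_score": {0: -0.6197478578718253, 1: -0.6164605619489576,
                        2: -0.6229674626212879, 3: -0.7963084775995496,
                        4: -0.7854265341671137},
})

# The key receives the whole column and returns a Series of the same
# length holding only the first element of each tuple.
sorted_df = df.sort_values(
    by="Params",
    key=lambda col: col.map(lambda t: t[0]),
    kind="mergesort",  # stable sort: ties keep their original order (added assumption)
)

print(sorted_df.index.tolist())  # [0, 2, 4, 1, 3]
```

Because only the first element is compared, the None in (1600, None) never takes part in a comparison.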