
Pandas sort_values does not sort numbers correctly

I'm new to pandas and to working with tabular data in a programming environment. I have sorted a DataFrame by a specific column, but the result pandas returns is not correct.

Here is the code I have used:

league_dataframe.sort_values('overall_league_position')

In the result, the values in the 'overall_league_position' column are not sorted in ascending order, which is the default for the method.


What am I doing wrong? Thanks for your patience!

For whatever reason, you seem to be working with a column of strings, so sort_values is returning a lexicographically sorted result: the values are compared character by character, which places '10' before '2'.

Here's an example.

df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})
df

  Col
0   1
1   2
2   3
3  10
4  20
5  19

df.sort_values('Col')

  Col
0   1
3  10
5  19
1   2
4  20
2   3
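One way to confirm this diagnosis is to inspect what the column actually holds. The sketch below reuses the toy frame from above; on the real data, the column name would be 'overall_league_position' instead of 'Col'.

```python
import pandas as pd

# Same toy data as above: digits stored as strings.
df = pd.DataFrame({"Col": ['1', '2', '3', '10', '20', '19']})

# False here means pandas does not treat the column as numbers.
is_numeric = pd.api.types.is_numeric_dtype(df["Col"])
print(is_numeric)

# The Python type of the individual values is a more direct check.
value_types = df["Col"].apply(type).unique()
print(value_types)
```

If the check reports a non-numeric column holding str values, the sort is comparing text rather than numbers.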

The remedy is to convert the column to a numeric dtype, using either .astype or pd.to_numeric.

df.Col = df.Col.astype(float)

Or,

df.Col = pd.to_numeric(df.Col, errors='coerce')
df.sort_values('Col')

   Col
0    1
1    2
2    3
3   10
5   19
4   20

The main difference between astype and pd.to_numeric is that the latter is more robust at handling non-numeric strings (with errors='coerce' they are converted to NaN, whereas astype raises an error), and it will preserve integers when a conversion to float is not necessary (as seen in this case).
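To make that difference concrete, here is a minimal sketch. The Series named raw and its 'n/a' entry are made-up illustration data, not taken from the question.

```python
import pandas as pd

# Hypothetical column with one entry that is not a number.
raw = pd.Series(['1', '10', 'n/a', '2'])

# astype raises as soon as it meets a value it cannot parse.
try:
    raw.astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

# to_numeric with errors='coerce' turns the bad entry into NaN instead.
coerced = pd.to_numeric(raw, errors='coerce')
print(coerced)

# With clean integer strings, to_numeric keeps an integer dtype.
clean = pd.to_numeric(pd.Series(['1', '10', '2']))
print(clean.dtype)
```

Note that once a NaN is present, the coerced column is necessarily float, so the integer dtype is only preserved when every value parses cleanly.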

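Using the sort_naturally function instead of sort_values also works well for numbers. Note that it is not part of pandas itself; it is added to DataFrames by the pyjanitor package (import janitor). Below is the syntax: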

league_dataframe.sort_naturally('overall_league_position')

Natural sorting is distinct from the default lexicographical sorting provided by pandas.

For example, given the following list of items:

["A1", "A11", "A3", "A2", "A10"]

lexicographical sorting would give us:

["A1", "A10", "A11", "A2", "A3"]

By contrast, "natural" sorting would give us:

["A1", "A2", "A3", "A10", "A11"]

This function thus provides "natural" sorting on a single column of a data frame. For further information, type the following command in the console or in a Jupyter notebook code cell:

?league_dataframe.sort_naturally
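For readers who want to see what natural ordering does without relying on sort_naturally, the idea can be sketched in plain Python. The helper name natural_key is hypothetical, and this is only an illustration of the concept applied to the list above, not how sort_naturally is implemented.

```python
import re

def natural_key(text):
    # Split into alternating non-digit and digit runs: "A10" -> ["A", "10", ""].
    parts = re.split(r'(\d+)', text)
    # Odd positions hold the captured digit runs; compare those as integers
    # and leave the text runs at even positions as strings.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

items = ["A1", "A11", "A3", "A2", "A10"]

lexicographic = sorted(items)             # plain string comparison
natural = sorted(items, key=natural_key)  # digit runs compared as numbers

print(lexicographic)
print(natural)
```

Because the key always alternates text and integer parts in the same positions, Python never has to compare a string with an integer.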
