Sort a data frame in Python with duplicates by a string list

Question

I have a data frame with 250 names and associated values, imported into Python via pandas read_csv. The data look like this:

name	val1	val2	val3
George	2.5	1.1	1.0
George	3.1	1.4	0.0
George	1.1	0.9	4.1
Tom	2.1	1.2	-3.0
Tom	3.0	-1.2	3.5
Tom	7.3	5.2	-1.2
Tom	0.1	0.1	0.1
...	...	...	...
Sally	6.1	9.1	-5.6
Sally	5.7	4.7	9.1

I want to reorder the rows so that the names follow a particular order:

neworder = ['Sally', ..., 'George', 'Tom']

so that the result is:

name	val1	val2	val3
Sally	6.1	9.1	-5.6
Sally	5.7	4.7	9.1
...	...	...	...
George	2.5	1.1	1.0
George	3.1	1.4	0.0
George	1.1	0.9	4.1
Tom	2.1	1.2	-3.0
Tom	3.0	-1.2	3.5
Tom	7.3	5.2	-1.2
Tom	0.1	0.1	0.1

In IDL I would do this with some for loops, but I suspect there's a sorting function in Python that my Google skills have not been able to find.

Answer 1

Create a lookup dictionary that maps each name to its sort position, either numbered by hand or generated from the list:

name_order = {'Sally':1, ... , 'George':12, 'Tom':13} # hand-numbered

neworder = ['Sally', ... , 'George', 'Tom']
name_order = {nm:ix for ix,nm in enumerate(neworder)} # generated

Then pass it, via a lambda function, to the key parameter of sort_values (available since pandas 1.1.0), which returns a new, sorted data frame:

df.sort_values(by='name', key=lambda nm: nm.map(name_order))
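To make the call above concrete, here is a minimal runnable sketch. The small frame and the three-name order list are stand-ins made up for illustration (the question's full data and the elided part of neworder are not reproduced), and sorted_df is just an illustrative variable name.

```python
import pandas as pd

# Toy stand-in for the real data: a few rows taken from the question.
df = pd.DataFrame({
    "name": ["George", "George", "Tom", "Tom", "Sally", "Sally"],
    "val1": [2.5, 3.1, 2.1, 3.0, 6.1, 5.7],
})

# Shortened order list (assumption: the real list has more names).
neworder = ["Sally", "George", "Tom"]

# Map each name to its position in the desired order.
name_order = {nm: ix for ix, nm in enumerate(neworder)}

# The key function receives the whole "name" column as a Series and must
# return a Series of the same length; rows are sorted by the mapped positions.
sorted_df = df.sort_values(by="name", key=lambda nm: nm.map(name_order))

print(sorted_df["name"].tolist())
```

Note that df itself is left unchanged; keep the return value (or reassign it to df) if the sorted frame is what you need.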

I'd need to think a bit about what happens if an unexpected name appears; you might be able to cope with this by making name_order a collections.defaultdict.
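A rough sketch of both behaviours, for what it's worth. With a plain dict, Series.map turns names that have no entry into NaN, and sort_values places NaN last by default. With a defaultdict, map uses the default value instead. The name "Zed" and the default rank of -1 are made up for this illustration.

```python
import collections

import pandas as pd

df = pd.DataFrame({
    "name": ["Zed", "Tom", "George", "Sally"],  # "Zed" is a hypothetical unexpected name
    "val1": [9.9, 2.1, 2.5, 6.1],
})
neworder = ["Sally", "George", "Tom"]  # shortened order list for illustration

# Plain dict: "Zed" maps to NaN, which sort_values puts at the end
# (na_position="last" is the default).
plain = {nm: ix for ix, nm in enumerate(neworder)}
unknown_last = df.sort_values(by="name", key=lambda nm: nm.map(plain))

# defaultdict: missing names get the default rank instead of NaN.
# A default of -1 ranks unknown names before everything in neworder.
fallback = collections.defaultdict(lambda: -1, plain)
unknown_first = df.sort_values(by="name", key=lambda nm: nm.map(fallback))

print(unknown_last["name"].tolist())
print(unknown_first["name"].tolist())
```

Choosing a default larger than len(neworder) would instead push unknown names to the end, much like the plain-dict case.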

Answer 2

This is the solution:

neworder = ['Sally', ... , 'George', 'Tom']
name_order = {nm:ix for ix,nm in enumerate(neworder)} # generated
df.sort_values(by='name', key=lambda nm: nm.map(name_order))
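One caveat about the duplicates: the desired output in the question keeps the rows of each name in their original relative order. The default sort algorithm in sort_values (quicksort) is not guaranteed to be stable, so a cautious variant, sketched below under that assumption, requests a stable sort with kind='mergesort'. The frame reuses only the val1 column from the question, and the three-name order list is a shortened stand-in.

```python
import pandas as pd

# Rows from the question, in their original order (val1 only, for brevity).
df = pd.DataFrame({
    "name": ["George"] * 3 + ["Tom"] * 4 + ["Sally"] * 2,
    "val1": [2.5, 3.1, 1.1, 2.1, 3.0, 7.3, 0.1, 6.1, 5.7],
})

neworder = ["Sally", "George", "Tom"]  # shortened order list for illustration
name_order = {nm: ix for ix, nm in enumerate(neworder)}

# mergesort is stable: rows with equal keys keep their original relative order.
stable_df = df.sort_values(
    by="name",
    key=lambda nm: nm.map(name_order),
    kind="mergesort",
)

print(stable_df.to_string(index=False))
```

Within each name, the val1 values appear in the same sequence as in the input, which matches the layout the question asks for.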

Thanks @Joffan and @ShubhamSharma

solution1
3 ACCEPTED 2021-05-13 17:35:40

solution2
0 2021-05-13 18:03:45
