My question is basically the same as this one: Sorting a pandas DataFrame by one level of a MultiIndex
That is, I want to sort a MultiIndex DataFrame along one level. The problem is that the index ["foo2","foo1","foo10"] is sorted lexicographically as ["foo1","foo10","foo2"] instead of in natural order, ["foo1","foo2","foo10"], and I cannot pass a "key" argument as I can with list.sort() (see the example below). How should I handle this? Should I call reset_index, sort the column, and then set the index again?
import pandas as pd
import re

def atoi(text):
    # Digit runs become ints so they compare numerically
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split a label into alternating text and number chunks
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# example on a list
L1 = ["foo2", "foo1", "foo10"]
print(sorted(L1))
print(sorted(L1, key=natural_keys))
print()

df = pd.DataFrame([{'I1':'foo2','I2':'b','val':2},{'I1':'foo1','I2':'a','val':1},{'I1':'foo10','I2':'c','val':3}])
df = df.set_index(['I1','I2'])
sorted_df = df.sort_index(level=0)
print(sorted_df)
print()

expected_df = pd.DataFrame([{'I1':'foo1','I2':'a','val':1},{'I1':'foo2','I2':'b','val':2},{'I1':'foo10','I2':'c','val':3}])
expected_df = expected_df.set_index(['I1','I2'])
print(expected_df)
          val
I1    I2
foo1  a     1
foo10 c     3
foo2  b     2
EXPECTED DF:
          val
I1    I2
foo1  a     1
foo2  b     2
foo10 c     3
Thanks
As explained by Jon Clements, if you are on pandas >= 1.1.0 you can use the key argument of sort_index. If you also want to distinguish between several numbers within the index labels (for example, foo_1_bar_2 should come before foo_2_bar_1), you need to combine several functions:
import pandas as pd
import re

def atoi(text):
    # Digit runs become ints so they compare numerically
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split a label into alternating text and number chunks
    return [atoi(c) for c in re.split(r'(\d+)', text)]

def sort_index(index):
    # Replace each label by its position in natural order
    ordered = sorted(index, key=natural_keys)
    return [ordered.index(val) for val in index]

df = pd.DataFrame([{'I1':'foo2','I2':'b','val':2},{'I1':'foo1','I2':'a','val':1},{'I1':'foo10','I2':'c','val':3}])
df = df.set_index(['I1','I2'])
sorted_df = df.sort_index(level=0, key=sort_index)
I have not found a simple solution for earlier versions of pandas.
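One workaround that does not rely on key, offered as a sketch rather than something tested across old releases: compute the row order in plain Python with sorted() and reorder the frame positionally with iloc. The variable names labels and order are only illustrative.

```python
import re
import pandas as pd

def natural_keys(text):
    # Split into digit and non-digit runs; digit runs compare as integers.
    return [int(c) if c.isdigit() else c for c in re.split(r'(\d+)', text)]

df = pd.DataFrame([{'I1': 'foo2', 'I2': 'b', 'val': 2},
                   {'I1': 'foo1', 'I2': 'a', 'val': 1},
                   {'I1': 'foo10', 'I2': 'c', 'val': 3}])
df = df.set_index(['I1', 'I2'])

# Labels of the level to sort on, one per row.
labels = df.index.get_level_values('I1')

# Row positions in natural order; sorted() is stable, so rows with
# equal labels keep their original relative order.
order = sorted(range(len(labels)), key=lambda pos: natural_keys(labels[pos]))

sorted_df = df.iloc[order]
print(sorted_df)
```

This avoids the reset_index / set_index round trip, at the cost of doing the sort outside pandas.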