How to sort multiindex column month names?
I have this MultiIndex df:
YEARS_TMAX TMAX YEARS_TMAX TMAX YEARS_TMAX
MONTH April April August August December .....
CODE NAME
000130 RICA PLAYA 21.0 31.5 21.0 21.5 22.0
000132 PUERTO PIZARRO 12.0 33.8 12.0 32.4 11.0
000134 PAPAYAL 23.0 33.2 22.0 22.4 21.0
000135 EL SALTO 22.0 33.6 23.0 22.8 22.0
000136 CAÑAVERAL 16.0 32.7 15.0 33.1 11.0
... ... ... ... ...
158317 SUSAPAYA 19.0 17.6 19.0 17.3 21.0
158321 PALCA 16.0 19.3 17.0 19.8 16.0
158323 TALABAYA 12.0 17.6 13.0 17.5 13.0
158326 CAPAZO 17.0 13.6 17.0 13.0 19.0
158328 PAUCARANI 14.0 13.3 13.0 11.9 15.0
I want to sort the columns by month name (with the TMAX column first within each month), like this:
TMAX YEARS_TMAX TMAX YEARS_TMAX TMAX
MONTH January January February February March .....
CODE NAME
000130 RICA PLAYA 22.0 31.5 23.0 27.5 23.0
000132 PUERTO PIZARRO 17.0 32.8 18.0 30.4 18.0
000134 PAPAYAL 25.0 32.2 26.0 28.4 25.0
000135 EL SALTO 26.0 31.6 26.0 26.8 26.0
000136 CAÑAVERAL 16.0 32.7 18.0 31.1 15.0
... ... ... ... ...
158317 SUSAPAYA 19.0 17.6 19.0 17.3 21.0
158321 PALCA 16.0 19.3 17.0 19.8 16.0
158323 TALABAYA 12.0 17.6 13.0 17.5 13.0
158326 CAPAZO 17.0 13.6 17.0 13.0 19.0
158328 PAUCARANI 14.0 13.3 13.0 11.9 15.0
So I wrote this code (source: Sorting "dates" in a MultiIndex):
dates = pd.to_datetime(df.columns.get_level_values(1), format='%B')
df.columns = [df.columns.get_level_values(0), dates]
df = df.sort_index(axis=1, level=1)
to sort the columns by month, but dates does not contain month names; it contains what look like random dates. How can I solve this?
Thanks in advance.
Use a CategoricalDtype: create an ordered dtype from calendar.month_name, which ensures the columns sort in calendar order rather than alphabetically.
month_dtype = pd.CategoricalDtype(categories=list(month_name), ordered=True)
df.columns = [df.columns.get_level_values(0),
              df.columns.get_level_values(1).astype(month_dtype)]
df = df.sort_index(axis=1, level=[1, 0])
Sample data and imports:
from calendar import month_name
import pandas as pd
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]],
    columns=pd.MultiIndex.from_product([
        ['YEARS_TMAX', 'TMAX'],
        ['March', 'January', 'February']
    ])
)
df before sorting:
YEARS_TMAX TMAX
March January February March January February
0 1 2 3 4 5 6
1 7 8 9 10 11 12
df after sorting:
TMAX YEARS_TMAX TMAX YEARS_TMAX TMAX YEARS_TMAX
January January February February March March
0 5 2 6 3 4 1
1 11 8 12 9 10 7
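To see why the ordered categorical changes the result, here is a minimal sketch that compares plain string sorting with categorical sorting on a small index. It assumes English month names, and it drops the empty string at position 0 of calendar.month_name, which is an optional tweak and not part of the answer above.

```python
from calendar import month_name
import pandas as pd

# month_name[0] is an empty string; it is dropped here (optional tweak).
month_dtype = pd.CategoricalDtype(categories=list(month_name)[1:], ordered=True)

months = pd.Index(['March', 'January', 'February'])

# Plain strings sort alphabetically.
plain_sorted = months.sort_values().tolist()

# An ordered categorical sorts by the position of each label in `categories`.
categorical_sorted = months.astype(month_dtype).sort_values().tolist()

print(plain_sorted)
print(categorical_sorted)
```

The labels stay readable month names; only the ordering rule changes.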
The datetime approach also works, but the level has to be converted back to strings with DatetimeIndex.strftime:
df.columns = [df.columns.get_level_values(0),
              pd.to_datetime(df.columns.get_level_values(1), format='%B')]
df = df.sort_index(axis=1, level=[1, 0])
# convert back to strings
df.columns = [df.columns.get_level_values(0),
              df.columns.get_level_values(1).strftime('%B')]
df:
TMAX YEARS_TMAX TMAX YEARS_TMAX TMAX YEARS_TMAX
January January February February March March
0 5 2 6 3 4 1
1 11 8 12 9 10 7
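The "random dates" from the question are most likely the full timestamps that pd.to_datetime builds when only a month name is parsed. As far as I know, the missing year and day are filled with placeholder values (year 1900, day 1), but treat that as an assumption to check against your pandas version. A small sketch, assuming English month names:

```python
import pandas as pd

# Parsing only a month name still yields complete timestamps.
parsed = pd.to_datetime(pd.Index(['March', 'January', 'February']), format='%B')
print(parsed)

# The month information is intact, so sorting works as expected...
month_numbers = parsed.month.tolist()

# ...and strftime('%B') turns the sorted timestamps back into month names.
round_trip = parsed.sort_values().strftime('%B').tolist()
print(round_trip)
```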
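The drawback of this approach is that level 1 is a string type again, so it has to be converted every time the sort order needs to change, since lexicographic ordering is not what is wanted here.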
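If the column labels should stay plain strings, one possible alternative is to compute the column order with a lookup table and reorder the frame, leaving the dtype untouched. This is only a sketch: the helper name sort_columns_by_month and the MONTH_NUMBER table are hypothetical, and it assumes level 1 holds English month names.

```python
from calendar import month_name
import pandas as pd

# Map 'January' -> 1, ..., 'December' -> 12 (month_name[0] is '' and is skipped).
MONTH_NUMBER = {name: number for number, name in enumerate(month_name) if name}


def sort_columns_by_month(frame):
    """Hypothetical helper: order (field, month) columns by month, then field."""
    ordered = sorted(frame.columns,
                     key=lambda col: (MONTH_NUMBER[col[1]], col[0]))
    # Selecting with a list of tuples keeps the MultiIndex and its string labels.
    return frame[ordered]


df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]],
    columns=pd.MultiIndex.from_product([
        ['YEARS_TMAX', 'TMAX'],
        ['March', 'January', 'February']
    ])
)

result = sort_columns_by_month(df)
print(result)
```

The helper can be called again whenever columns are added, without converting level 1 back and forth.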