
How to sort a pandas DataFrame by two date columns (year and month name)

I have a pandas dataframe like this:

   column_year  column_Month  a_integer_column
0   2014        April         25.326531
1   2014        August        25.544554
2   2015        December      25.678261
3   2014        February      24.801187
4   2014        July          24.990338
...  ...           ...           ...
68  2018        November      26.024931
69  2017        October       25.677333
70  2019        September     24.432361
71  2020        February      25.383648
72  2020        January       25.504831

I now want to sort by the year column first and then by the month column in calendar order (not alphabetically), like this:

   column_year  column_Month  a_integer_column
3   2014        February      24.801187
0   2014        April         25.326531
4   2014        July          24.990338
1   2014        August        25.544554
2   2015        December      25.678261
...  ...           ...            ...
69  2017        October       25.677333
68  2018        November      26.024931
70  2019        September     24.432361
72  2020        January       25.504831
71  2020        February      25.383648

How do I do this?

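Let us try to_datetime + argsort: join the year and the month name into one string, parse it as a date, and reorder the rows by the positions that sort those dates:

df = df.iloc[pd.to_datetime(df.column_year.astype(str) + df.column_Month, format='%Y%B').argsort()]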
   column_year column_Month  a_integer_column
3         2014     February         24.801187
0         2014        April         25.326531
4         2014         July         24.990338
1         2014       August         25.544554
2         2015     December         25.678261
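
For a self-contained check of the to_datetime approach, here is a minimal sketch on a small made-up frame. The sample rows and the helper name sort_by_year_month are illustrative and not from the question; the sketch also assumes English month names, since %B parsing depends on the locale.

```python
import pandas as pd

def sort_by_year_month(df, year_col="column_year", month_col="column_Month"):
    """Return df sorted chronologically by a year column and a month-name column."""
    # Build "2014April"-style strings and parse them (%Y = year, %B = full month name).
    dates = pd.to_datetime(df[year_col].astype(str) + df[month_col], format="%Y%B")
    # A stable argsort keeps rows with identical year/month in their original order.
    order = dates.to_numpy().argsort(kind="stable")
    # iloc reorders the rows by position, so the original index labels are kept.
    return df.iloc[order]

# Illustrative sample data (hypothetical, shaped like the frame in the question).
sample = pd.DataFrame({
    "column_year": [2014, 2014, 2015, 2014, 2020, 2020],
    "column_Month": ["April", "August", "December", "February", "February", "January"],
})
result = sort_by_year_month(sample)
print(result)
```

Because the rows are reordered by position, this does not require a unique index, and the original frame is left unchanged.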

You can convert the column_Month column to an ordered CategoricalDtype, so that sorting follows calendar order instead of alphabetical order:

Months = pd.CategoricalDtype([
    'January', 'February', 'March', 'April', 'May', 'June',
    'July', 'August', 'September', 'October', 'November', 'December'
], ordered=True)

df.astype({'column_Month': Months}).sort_values(['column_year', 'column_Month'])

    column_year column_Month  a_integer_column
3          2014     February         24.801187
0          2014        April         25.326531
4          2014         July         24.990338
1          2014       August         25.544554
2          2015     December         25.678261
69         2017      October         25.677333
68         2018     November         26.024931
70         2019    September         24.432361
72         2020      January         25.504831
71         2020     February         25.383648
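
As a variation on the categorical approach, the month list can be taken from the standard library instead of being typed out. This is a sketch under the assumption that the process runs in an English locale (calendar.month_name is locale-dependent); the sample frame is made up for illustration.

```python
import calendar
import pandas as pd

# calendar.month_name[0] is an empty string, so skip it to get January..December.
month_names = list(calendar.month_name)[1:]
month_dtype = pd.CategoricalDtype(month_names, ordered=True)

# Illustrative sample data (hypothetical).
sample = pd.DataFrame({
    "column_year": [2014, 2014, 2015, 2014, 2020, 2020],
    "column_Month": ["April", "August", "December", "February", "February", "January"],
})

# astype returns a new frame; `sample` keeps its plain string column.
sorted_df = (
    sample.astype({"column_Month": month_dtype})
          .sort_values(["column_year", "column_Month"])
)
print(sorted_df)
```

Any value that is not in the category list (a misspelling, or an abbreviation such as "Feb") becomes NaN after the cast and is placed last by sort_values by default, so it is worth checking for missing values after converting.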
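
Once column_Month is an ordered categorical, a plain sort_values on both columns is enough. Note that on a plain string column the same call sorts the months alphabetically (April, August, December, ...), not in calendar order:

df = df.sort_values(by=["column_year", "column_Month"], ascending=[True, True])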
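
The difference between alphabetical and calendar ordering is easy to see on a tiny frame. The sketch below first sorts a string month column directly, then sorts by a temporary month-number column. The column name _month_num and the sample rows are hypothetical, and English month names are assumed.

```python
import pandas as pd

# Illustrative sample data (hypothetical), all within one year.
sample = pd.DataFrame({
    "column_year": [2014, 2014, 2014, 2014],
    "column_Month": ["April", "August", "February", "July"],
})

# Plain string column: months within a year are compared as text.
alphabetical = sample.sort_values(by=["column_year", "column_Month"])
print(alphabetical["column_Month"].tolist())

# Temporary helper column holding the month number (1-12), dropped after sorting.
month_num = pd.to_datetime(sample["column_Month"], format="%B").dt.month
chronological = (
    sample.assign(_month_num=month_num)
          .sort_values(by=["column_year", "_month_num"])
          .drop(columns="_month_num")
)
print(chronological["column_Month"].tolist())
```

The helper-column route leaves column_Month as ordinary strings, which may be preferable if later code does not expect a categorical column.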
