Sorting data by day and month (ignoring year) in Python pandas

I found many questions similar to mine, but none of them answers it exactly (this one comes closest, but it focuses on Ruby).

I have a pandas DataFrame like this:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Date': pd.date_range('2014-10-03', '2015-10-02', freq='1D'), 'Variable': np.random.randn(365)})
df.head()

Out[272]: 
        Date  Variable
0 2014-10-03  0.637167
1 2014-10-04  0.562135
2 2014-10-05 -1.069769
3 2014-10-06  0.556997
4 2014-10-07  0.253468

I want to sort the data from January 1st to December 31st, ignoring the year component of the Date column. The background is that I want to track changes in Variable over the year, but my period starts and ends in October.

I thought of creating separate columns for month and day and then sorting by those. But I am unsure how to do this in a "correct" and concise way.

Expected output:

  Date   Variable
0 01-01  0.637167  # (Placeholder-values)
1 01-02  0.562135
2 01-03 -1.069769
3 01-04  0.556997
4 01-05  0.253468

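Use argsort on a month-day key. Since argsort returns positions rather than labels, iloc is the safer indexer here (loc only works because the index is the default 0..n-1):

yourdf = df.iloc[df.Date.dt.strftime('%m%d').astype(int).argsort()]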
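A minimal, self-contained sketch of the argsort approach, for anyone who wants to check it end to end. The `Variable` values are replaced by a deterministic counter (an assumption, so the result is reproducible), and the names `key`, `order` and `result` are illustrative only. The last step reformats `Date` to month-day strings to match the expected output in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': pd.date_range('2014-10-03', '2015-10-02', freq='1D'),
    'Variable': np.arange(365, dtype=float),  # deterministic stand-in for random data
})

# Integer key per row: 101 for January 1st, 1231 for December 31st.
key = df['Date'].dt.strftime('%m%d').astype(int)

# argsort yields positions, so index with iloc; a stable sort keeps ties in input order.
order = np.argsort(key.to_numpy(), kind='stable')
result = df.iloc[order].reset_index(drop=True)

# Optional: keep only month and day in the displayed date.
result['Date'] = result['Date'].dt.strftime('%m-%d')
print(result.head())
```

If the data spanned several years, rows sharing the same month and day would stay in their original (chronological) order because the sort is stable.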

You can create the day and month columns with the .dt accessor, which avoids a row-by-row apply:

df = pd.DataFrame(data=pd.date_range('2014-10-03', '2015-10-02', freq='1D'), columns=['date'])
df['day'] = df['date'].dt.day
df['month'] = df['date'].dt.month

Once these columns exist, sort by month and then by day. This could be made more compact, but for a quick analysis the above is sufficient.
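A short sketch of the sorting step that follows from the helper columns, assuming the lowercase `date` column used in this answer. The `variable` column and the name `sorted_df` are hypothetical additions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'date': pd.date_range('2014-10-03', '2015-10-02', freq='1D')})
df['variable'] = np.arange(len(df), dtype=float)  # placeholder values

# Helper columns extracted with the vectorised .dt accessor.
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day

# Sort by month first, then by day within each month; the year plays no role.
sorted_df = df.sort_values(['month', 'day']).reset_index(drop=True)
print(sorted_df.head())
```

The helper columns can be dropped afterwards with `sorted_df.drop(columns=['month', 'day'])` if they are not needed.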
