I have a large dataset with a date column (which is not the index) in the format %Y-%m-%d %H:%M:%S.
I would like to create quarterly subsets of this data frame, i.e. the data frame dfQ1 would contain all rows where the date was between month [1 and 4], dfQ2 would contain all rows where the date was between month [5 and 8], etc. The header of each subset is the same as that of the main data frame.
How can I do this?
Thanks!
I would add a new column containing the quarter, i.e.:
from datetime import datetime
date_format = "%Y-%m-%d %H:%M:%S"
date_to_qtr = lambda dt: 1 + (datetime.strptime(dt, date_format).month-1) // 3
df['qtr'] = df['date'].apply(date_to_qtr)
(using floor division, the // operator). Then index on the new column:
dfQ1 = df[df.qtr == 1]
dfQ2 = df[df.qtr == 2]
dfQ3 = df[df.qtr == 3]
dfQ4 = df[df.qtr == 4]
Or, at that point, you can just use groupby: df.groupby("qtr") (see the docs).
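As a rough sketch of the groupby route, the groups can be collected into a dict of sub-frames keyed by quarter number. The sample rows and the `value` column are made up for illustration; only the `date` column and its format come from the question.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data; only the 'date' column and its format come from the question.
df = pd.DataFrame({
    "date": ["2015-01-15 08:00:00", "2015-05-02 12:30:00",
             "2015-08-20 23:59:59", "2015-11-11 11:11:11"],
    "value": [10, 20, 30, 40],
})

date_format = "%Y-%m-%d %H:%M:%S"
# Months 1-3 map to 1, 4-6 to 2, 7-9 to 3, 10-12 to 4.
date_to_qtr = lambda dt: 1 + (datetime.strptime(dt, date_format).month - 1) // 3
df["qtr"] = df["date"].apply(date_to_qtr)

# Iterating a groupby yields (key, sub-frame) pairs; each sub-frame keeps all columns.
quarters = {q: sub for q, sub in df.groupby("qtr")}
```

With this, `quarters[1]` plays the role of dfQ1, and a quarter with no rows is simply absent from the dict.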
Using pandas, you can first create a datetime column and then create a quarter column using the datetime quarter attribute:
import pandas as pd
date_format = "%Y-%m-%d %H:%M:%S"
# Parse the strings into a datetime64 column in one vectorized call
df['datetime'] = pd.to_datetime(df['date'], format=date_format)
# The .dt accessor exposes the quarter (1-4) of each timestamp
df['quarter'] = df['datetime'].dt.quarter
From there you can subset the dataframe with groupby (df.groupby('quarter')) or by indexing:
dfQ1 = df[df.quarter == 1]
dfQ2 = df[df.quarter == 2]
dfQ3 = df[df.quarter == 3]
dfQ4 = df[df.quarter == 4]
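One thing to watch with a bare quarter number: if the data spans several years, the first quarter of every year ends up in the same subset. If that is not what you want, a possible variation is to group on a quarterly period instead. This is only a sketch; the sample rows, the `value` column, and the `period` column name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data spanning two years.
df = pd.DataFrame({
    "date": ["2015-02-01 09:00:00", "2015-06-15 10:00:00",
             "2016-02-01 09:00:00", "2016-12-31 23:59:59"],
    "value": [1, 2, 3, 4],
})

df["datetime"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S")
df["quarter"] = df["datetime"].dt.quarter          # 1-4, year ignored
df["period"] = df["datetime"].dt.to_period("Q")    # year-aware, e.g. 2015Q1

# One sub-frame per (year, quarter), keyed by a label such as "2016Q1".
by_period = {str(p): sub for p, sub in df.groupby("period")}
```

Here `by_period["2015Q1"]` and `by_period["2016Q1"]` stay separate, whereas filtering on `quarter == 1` would return both rows.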
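Assuming you're using pandas, and that Qstartdate and Qenddate hold the first instant of the quarter and the first instant of the next one:
# >= on the start keeps rows that fall exactly on the quarter boundary
dfQ1 = df[(df.date >= Qstartdate) & (df.date < Qenddate)]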
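To make the range filter concrete, here is a small sketch with explicit boundaries. The sample rows and the choice of 2015 are made up; `Qstartdate` and `Qenddate` are the names used above, defined here as `pd.Timestamp` values. It assumes the date column has been parsed to datetimes first, and uses a half-open interval (start included, end excluded) so that boundary rows fall into exactly one quarter.

```python
import pandas as pd

# Hypothetical sample data with rows sitting on the quarter boundaries.
df = pd.DataFrame({
    "date": ["2015-01-01 00:00:00", "2015-03-31 23:59:59", "2015-04-01 00:00:00"],
    "value": [1, 2, 3],
})
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d %H:%M:%S")

# Assumed boundaries for Q1 of 2015: start inclusive, end exclusive.
Qstartdate = pd.Timestamp("2015-01-01")
Qenddate = pd.Timestamp("2015-04-01")

dfQ1 = df[(df["date"] >= Qstartdate) & (df["date"] < Qenddate)]
```

The row stamped 2015-04-01 00:00:00 is left out of dfQ1 and would be picked up by the Q2 range instead.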