
Getting hourly/daily/weekly data from a pandas DataFrame

I have a pandas DataFrame whose index is made of datetime.datetime(year, month, day, hour, minute) values.

I want to use it to get hourly, daily, and weekly data, where the hourly data is the last entry for each hour in the frame (and likewise for days and weeks).

Is there a built-in way to do this? I tried handling each case by hand. For daily data, for example, I set the hour and minute of each index entry to zero, but I still end up with a dataframe that has multiple entries for the same day. How can I get the last entry for each day?

sample dataframe:

         index                x          y
2016-01-01 00:07:00-05:00   1.000      0.000
2016-01-01 00:10:00-05:00   1.000      0.000
2016-01-01 00:15:00-05:00   1.000      0.000
2016-01-01 00:16:00-05:00   1.000      0.000
2016-01-01 00:20:00-05:00   1.000      0.000
2016-01-01 00:21:00-05:00   1.000      0.000
2016-01-01 00:26:00-05:00   1.000      0.000
2016-01-01 00:31:00-05:00   1.000      0.000
2016-01-01 00:37:00-05:00   1.000      0.000
2016-01-01 00:40:00-05:00   1.000      0.000
2016-01-01 00:46:00-05:00   1.000      0.000
2016-01-01 00:51:00-05:00   1.000      0.000
2016-01-01 00:56:00-05:00   1.000      0.000
2016-01-03 19:26:00-05:00   1.000      0.000
2016-01-03 19:34:00-05:00   1.000      0.000
2016-01-03 20:02:00-05:00   1.000      0.000
2016-01-03 20:06:00-05:00   1.000      0.000
2016-01-03 20:07:00-05:00   1.000      0.000
2016-01-03 20:08:00-05:00   1.000      0.000
2016-01-03 20:10:00-05:00   1.000      0.000
2016-01-03 20:11:00-05:00   1.000      0.000
2016-01-03 20:12:00-05:00   1.000      0.000
2016-01-03 20:13:00-05:00   1.000      0.000

Assuming I understand your question (it would be helpful to see some code for examples), it sounds like you could use resample:

df.resample('D').last()
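Here is a minimal sketch of how that could look for the hourly, daily, and weekly cases. The frame is a cut-down stand-in for the sample in the question, with made-up x and y values so the "last" row of each bin is visible; the variable names are only for illustration.

```python
import pandas as pd

# A few timestamps taken from the sample frame; the x/y values are invented.
idx = pd.to_datetime([
    "2016-01-01 00:07:00-05:00",
    "2016-01-01 00:56:00-05:00",
    "2016-01-03 19:26:00-05:00",
    "2016-01-03 19:34:00-05:00",
    "2016-01-03 20:02:00-05:00",
    "2016-01-03 20:13:00-05:00",
])
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
     "y": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]},
    index=idx,
)

# Take the last row in each time bin. Bins with no data (for example
# 2016-01-02 in the daily case) come back as all-NaN rows, so drop them.
# Note: older pandas releases spell the hourly alias "H".
hourly = df.resample("h").last().dropna(how="all")
daily = df.resample("D").last().dropna(how="all")
weekly = df.resample("W").last().dropna(how="all")

print(hourly)
print(daily)
print(weekly)
```

Note that the rows in the result are labelled with the bin (the start of the hour or day, or the end of the week for "W"), not with the original timestamp of the last entry.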

It works like a groupby or pivot table:

DataFrame.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0)

Convenience method for frequency conversion and resampling of regular time-series data.

Parameters:

rule : string
    The offset string or object representing the target conversion.
axis : int, optional, default 0
closed : {'right', 'left'}
    Which side of the bin interval is closed.
label : {'right', 'left'}
    Which bin edge to label the bucket with.
convention : {'start', 'end', 's', 'e'}
loffset : timedelta
    Adjust the resampled time labels.
base : int, default 0
    For frequencies that evenly subdivide 1 day, the "origin" of the aggregated intervals. For example, for '5min' frequency, base could range from 0 through 4.

(This signature comes from an older pandas release. Newer releases removed the how argument; chain the aggregation instead, e.g. df.resample('D').last().)
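To make the closed and label parameters above a bit more concrete, here is a small sketch with a made-up series that has one point exactly on an hour boundary. The values and variable names are assumptions for illustration only.

```python
import pandas as pd

# Three made-up points; the middle one sits exactly on the 20:00 boundary.
idx = pd.to_datetime([
    "2016-01-03 19:34",
    "2016-01-03 20:00",
    "2016-01-03 20:13",
])
s = pd.Series([1, 2, 3], index=idx)

# Default for hourly bins: intervals are closed on the left and labelled
# with their left edge, so 20:00 belongs to [20:00, 21:00).
left = s.resample("h").last()

# Closed and labelled on the right: 20:00 now belongs to (19:00, 20:00]
# and that bin is labelled 20:00.
right = s.resample("h", closed="right", label="right").last()

print(left)
print(right)
```

Which setting is appropriate depends on whether a reading stamped exactly on the hour should count toward the hour that just ended or the one that is starting.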
