
What is the R equivalent of pandas .resample() method?

This is the closest link I've found: https://stats.stackexchange.com/questions/5305/how-to-re-sample-an-xts-time-series-in-r

But I don't see anything about the different ways to aggregate the data (like mean, count, anonymous function) which you can do in pandas.

For my program, I'm trying to have a dataframe be resampled every 2 minutes and take the average of the 2 values at each interval. Thanks!

I found this topic while looking for an R equivalent of pandas resample() for xts objects. I'm posting a solution just in case, for a time delta of five minutes, where ts is an xts object:

period.apply(ts, endpoints(ts, k=5, "minutes"), mean)
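For the 2-minute case in the question, the same call can be sketched with other aggregations (mean, count, anonymous function). This assumes the xts package is installed; the object x and its six one-minute observations are made up for illustration.

```r
library(xts)

# Hypothetical sample data: six one-minute observations with values 1..6
idx <- as.POSIXct("2015-01-01 00:00:00", tz = "UTC") + 60 * (0:5)
x <- xts(1:6, order.by = idx)

# Row positions that close each 2-minute bucket
ep <- endpoints(x, on = "minutes", k = 2)

# Mean per bucket, similar to .resample('2min').mean() in pandas
two_min_mean <- period.apply(x, ep, mean)

# Number of observations per bucket, similar to .count()
two_min_count <- period.apply(x, ep, length)

# Any anonymous function works too, here the max-min spread per bucket
two_min_spread <- period.apply(
  x, ep,
  function(v) max(coredata(v)) - min(coredata(v))
)
```

Note that period.apply() stamps each result with the last timestamp in its bucket, whereas pandas labels buckets with the left edge by default.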

If you use data.table and lubridate, it might look something like this:

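library(data.table)
library(lubridate)
# sample data: one observation per minute
# (tz = "UTC" makes ymd() return POSIXct; seq() cannot step a plain Date by minutes)
dt <- data.table(ts = seq(from = ymd("2015-01-01", tz = "UTC"), to = ymd("2015-07-01", tz = "UTC"), by = "mins"))
dt[, datum := runif(.N, 0, 100)]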

If you wanted to go from per-minute data to hourly means, you could do

dt[, mean(datum), by = floor_date(ts, "hour")]

If you had a bunch of columns and you wanted all of them to be averaged, you could do

dt[,lapply(.SD,mean),by=floor_date(ts,"hour")]

You can replace mean with any function you like, and "hour" with "second", "minute", "day", "week", "month", or "year". You can only aggregate toward coarser intervals: going from minutes to seconds would require magic, but going from microseconds to seconds works.
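Applied to the 2-minute averaging asked about in the question, this approach might look like the sketch below. The table dt2 and its values are invented for illustration, and the "2 minutes" unit relies on floor_date() accepting multiples of a unit, which recent lubridate versions support.

```r
library(data.table)
library(lubridate)

# Hypothetical sample data: six one-minute observations with values 1..6
dt2 <- data.table(
  ts    = ymd_hms("2015-01-01 00:00:00", tz = "UTC") + 60 * (0:5),
  datum = as.numeric(1:6)
)

# Mean, row count, and a custom anonymous function per 2-minute bucket
res <- dt2[, .(avg    = mean(datum),
               n      = .N,
               spread = (function(v) max(v) - min(v))(datum)),
           by = .(bucket = floor_date(ts, "2 minutes"))]

print(res)
```

Each bucket is labelled with its start time, which matches the default left-edge labelling of pandas .resample().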

It is not possible to convert a series from a lower periodicity to a higher periodicity - eg weekly to daily or daily to 5 minute bars, as that would require magic.

-Jeffrey Ryan from xts manual.

I never learned xts, so I don't know the syntax for doing this with xts objects, but that line is famous (or at least as famous as a line from a manual can be).

Have you looked into the R coin package? Here is a tutorial that might help you figure out whether it is what you are looking for: http://www.statmethods.net/stats/resampling.html

More information on the package can be found here: https://cran.r-project.org/web/packages/coin/coin.pdf

You could use reticulate to call the pandas methods directly:

library(reticulate)
pd <- import("pandas")

df <- r_to_py(df) # convert the R data.frame to a pandas DataFrame
df <- df$set_index(pd$DatetimeIndex(df['Date'])) # index by the Date column so resample() can bin on it
# Older pandas versions accepted resample('1H', how='median', ...); newer ones chain the aggregation instead
df_median_hours <- df$resample('1H', closed='left', label='left')$agg('median')
df_median_hours <- py_to_r(df_median_hours) # convert back to an R data.frame
