
Gaussian kernel density smoothing for pandas.DataFrame.resample?

I am using pandas.DataFrame.resample to resample random events to 1-hour intervals, and the results are very noisy. The noise does not go away if I increase the interval to 2 or 4 hours. It makes me wonder whether pandas has any method for generating a smoothed density estimate, such as a Gaussian kernel density method with an adjustable bandwidth to control the smoothing. I'm not seeing anything in the documentation, but thought I would post here before posting to the developer mailing list, since that is their preference. Scikit-Learn has precisely the Gaussian kernel density function that I want, so I will try to make use of it, but it would be a fantastic addition to pandas.

Any help is greatly appreciated!

hourly[0][344:468].plot()


pandas can apply an aggregation over a rolling window. The win_type parameter controls the window's shape, that is, the weights given to the points inside the window. Setting center=True places each label at the center of its window instead of at the right edge. For a Gaussian window, the std argument sets the width of the Gaussian and therefore the amount of smoothing. To do Gaussian smoothing:

import pandas as pd  # win_type='gaussian' also requires SciPy to be installed

hrly = pd.Series(hourly[0][344:468])
smooth = hrly.rolling(window=5, win_type='gaussian', center=True).mean(std=0.5)

http://pandas.pydata.org/pandas-docs/stable/computation.html#rolling
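
To make the weighting explicit, here is a minimal sketch that builds hourly counts from synthetic event times and applies a centered, Gaussian-weighted rolling mean with the weights computed by hand in NumPy. The synthetic data, the helper name gaussian_smooth, and the variable names are assumptions for illustration and are not from the original post. The weights follow the usual Gaussian window formula exp(-k^2 / (2 std^2)), so the result should agree with win_type='gaussian' for the same window and std.

```python
import numpy as np
import pandas as pd

# Synthetic event timestamps spread over five days (assumed data;
# the original post does not show how `hourly` was built).
rng = np.random.default_rng(0)
offsets = np.sort(rng.uniform(0, 5 * 24 * 3600, size=2000))
events = pd.Series(
    1, index=pd.Timestamp("2024-01-01") + pd.to_timedelta(offsets, unit="s")
)

# Count events per 1-hour bin.
hourly_counts = events.resample(pd.Timedelta(hours=1)).sum()


def gaussian_smooth(series, window=5, std=0.5):
    """Centered rolling mean with normalized Gaussian weights (hypothetical helper)."""
    k = np.arange(window) - (window - 1) / 2      # offsets from the window center
    weights = np.exp(-0.5 * (k / std) ** 2)       # Gaussian window
    weights /= weights.sum()                      # normalize so weights sum to 1
    return series.rolling(window=window, center=True).apply(
        lambda x: float(np.dot(x, weights)), raw=True
    )


smooth = gaussian_smooth(hourly_counts, window=5, std=0.5)
```

With std=0.5 nearly all of the weight sits on the center point, so the curve changes little. Raising std (and window along with it, for example window=25, std=4) acts like a wider bandwidth and gives a visibly smoother series.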

I have now found that this option is available in pandas.stats.moments.ewma, and it works quite nicely. Here are the results:

from pandas.stats.moments import ewma

hourly[0][344:468].plot(style='b')
ewma(hourly[0][344:468], span=35).plot(style='k')
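
The pandas.stats.moments module no longer exists in current pandas releases, so the import above fails there. A minimal sketch of the same smoothing with the Series.ewm method follows. The synthetic series and the names noisy and smoothed are assumptions standing in for hourly[0][344:468]. Note that an exponentially weighted mean only looks backward in time, so it is not a symmetric Gaussian kernel and the smoothed curve lags the data somewhat.

```python
import numpy as np
import pandas as pd

# Synthetic hourly counts standing in for hourly[0][344:468] (assumed data).
rng = np.random.default_rng(1)
idx = pd.date_range("2024-01-01", periods=124, freq=pd.Timedelta(hours=1))
noisy = pd.Series(rng.poisson(lam=20, size=len(idx)), index=idx)

# Exponentially weighted moving average with the same span as in the post.
smoothed = noisy.ewm(span=35).mean()
```

Plotting works as before, for example noisy.plot(style='b') followed by smoothed.plot(style='k').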

