
Pandas Aggregate Daily Data to Monthly Timeseries

I have a time series that looks like this (below)

I want to resample it to monthly frequency, so that the value for 2019-10 is the average of all the PTS values in October, the value for 2019-11 is the average of all the PTS values in November, etc.

However, when I use the .resample('M').mean() method, if the final day of a month does not have a value, it fills in a NaN in my data frame. How do I solve this?

Date        PTS    
2019-10-23  14.0
2019-10-26  14.0
2019-10-27   8.0
2019-10-29  29.0
2019-10-31  17.0
2019-11-03  12.0
2019-11-05   2.0
2019-11-07  15.0
2019-11-08   7.0
2019-11-14  16.0
2019-11-16  12.0
2019-11-20  22.0
2019-11-22   9.0
2019-11-23  20.0
2019-11-25  18.0

Does this work?

df.resample('M').mean().dropna()  # where df is your DataFrame
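To make the suggestion concrete, here is a minimal sketch using the rows from the question plus one made-up January row (an assumption, added only so that December has no observations at all). It uses "MS" (month-start labels) instead of "M" purely so the snippet runs on both old and new pandas versions; the name df is hypothetical.

```python
import pandas as pd

# Rows from the question, plus one hypothetical 2020-01 row so that
# December 2019 becomes a month with no observations.
df = pd.DataFrame(
    {
        "Date": ["2019-10-23", "2019-10-26", "2019-10-27", "2019-10-29",
                 "2019-10-31", "2019-11-03", "2019-11-05", "2019-11-07",
                 "2019-11-08", "2019-11-14", "2019-11-16", "2019-11-20",
                 "2019-11-22", "2019-11-23", "2019-11-25", "2020-01-02"],
        "PTS": [14.0, 14.0, 8.0, 29.0, 17.0, 12.0, 2.0, 15.0,
                7.0, 16.0, 12.0, 22.0, 9.0, 20.0, 18.0, 10.0],
    }
)
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# One row per calendar month between the first and last date;
# a month with no rows at all (December here) gets NaN.
monthly = df.resample("MS").mean()

# Drop the empty months.
monthly_clean = monthly.dropna()

print(monthly)
print(monthly_clean)
```

In this sketch the NaN comes from a month that has no rows at all, not from a missing value on the last day of a month, so it may be worth checking whether that is what happens in your data too.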

Do you have a code sample? This works:

import pandas as pd
import numpy as np

rng = np.random.default_rng()
days = np.arange(31)

data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(days, 60),
                     "values": rng.integers(0, 60, size=60)})

data.set_index("dates", inplace=True)

# Set the last day to null.
data.loc["2019-03-31"] = np.nan

# This works ("M" is deprecated since pandas 2.2; use "ME" there)
data.resample("M").mean()

It also works with an incomplete month:

incomplete_days = np.arange(10)

data = pd.DataFrame({"dates": np.datetime64("2019-03-01") + rng.choice(incomplete_days, 10),
                     "values": rng.integers(0, 60, size=10)})

data.set_index("dates", inplace=True)

data.resample("M").mean()

You should check your data and types more thoroughly in case the NaN you're receiving indicates a more pressing issue.
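One way to do that check, as a rough sketch: the frame below is hypothetical (both columns read in as strings, with one unparseable "n/a" entry) and only illustrates how a bad value can surface as NaN after resampling. "MS" is used instead of "M" so the snippet runs on old and new pandas versions.

```python
import pandas as pd

# Hypothetical raw data where both columns arrived as strings.
raw = pd.DataFrame(
    {
        "Date": ["2019-10-23", "2019-10-31", "2019-11-03"],
        "PTS": ["14.0", "17.0", "n/a"],
    }
)

# Neither column is datetime/float yet.
print(raw.dtypes)

# Coerce to proper types; anything unparseable becomes NaT / NaN.
raw["Date"] = pd.to_datetime(raw["Date"], errors="coerce")
raw["PTS"] = pd.to_numeric(raw["PTS"], errors="coerce")

# Count the values that failed to parse in each column.
print(raw.isna().sum())

# November's only value was "n/a", so its monthly mean is NaN.
monthly = raw.set_index("Date").resample("MS").mean()
print(monthly)
```

If the NaN count is non-zero before resampling, the problem is in the input data rather than in resample itself.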

Why not just drop the NaN values?
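If you would rather not create the empty months in the first place, a different technique (groupby on the monthly period instead of resample) is sketched below. The series is made-up sample data; groupby only produces groups for months that actually occur, and the result is labelled 2019-10, 2020-01, and so on.

```python
import pandas as pd

# Hypothetical sample: two October values and one January value,
# with nothing in November or December.
s = pd.Series(
    [14.0, 17.0, 10.0],
    index=pd.to_datetime(["2019-10-23", "2019-10-31", "2020-01-02"]),
    name="PTS",
)

# Group by calendar month; months with no rows never appear.
monthly = s.groupby(s.index.to_period("M")).mean()
print(monthly)
```

The trade-off is that gaps in the calendar are no longer visible in the result, which may or may not be what you want.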


 