
resample time series data pandas

I have a CSV file with one record every 10 minutes giving the number of passengers per line, but there is a gap from 1 pm to 4:50 with no records. How can I fill that gap with rows where the number of passengers is 0?


You could create a new dataframe with the dates and number of passengers you want by using pd.date_range to create the dates:

>>> import pandas as pd
>>> start_date = pd.to_datetime("2021-11-08 00:50:00.000")
>>> end_date = pd.to_datetime("2021-11-08 05:00:00.000")

The keyword argument inclusive is available from pandas 1.4.0 onward. In earlier versions both endpoints are included in the date range by default, so you'll have to add 10 minutes to start_date and subtract 10 minutes from end_date instead:

>>> target_date_range = pd.date_range(start=start_date, end=end_date, freq="10min", inclusive="neither")
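For pandas versions before 1.4.0, the endpoint adjustment described above could look roughly like the sketch below. The name step is introduced here only for illustration; the dates are the same ones used in this answer.

```python
import pandas as pd

start_date = pd.to_datetime("2021-11-08 00:50:00")
end_date = pd.to_datetime("2021-11-08 05:00:00")

# Hypothetical helper name: one 10-minute step, matching the data frequency.
step = pd.Timedelta(minutes=10)

# Move both endpoints inward by one step so that the last existing record
# (00:50) and the first existing record after the gap (05:00) are excluded.
target_date_range = pd.date_range(
    start=start_date + step,
    end=end_date - step,
    freq="10min",
)
```

This should produce the same timestamps as the inclusive="neither" call, running from 01:00 to 04:50.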

Now you can create a dataframe with the new rows and use pd.concat to append them to your original data:

>>> new_rows_df = pd.DataFrame({
    "date": target_date_range,
})
>>> new_rows_df["passengers"] = 0
>>> new_rows_df.head()
                 date  passengers
0 2021-11-08 01:00:00           0
1 2021-11-08 01:10:00           0
2 2021-11-08 01:20:00           0
3 2021-11-08 01:30:00           0
4 2021-11-08 01:40:00           0

>>> df = pd.concat([df, new_rows_df], ignore_index=True)
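As an alternative to building the missing rows by hand, you could reindex the data onto a complete 10-minute grid and fill the gaps with 0. This is a different technique from the concat approach above, shown only as a minimal sketch: the sample values are made up, and it assumes the column names date and passengers and exactly one row per timestamp (reindex raises an error on duplicate index values, so data with several lines per timestamp would need to be handled per line).

```python
import pandas as pd

# Hypothetical sample data with a gap between 00:50 and 05:00.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-11-08 00:40:00",
        "2021-11-08 00:50:00",
        "2021-11-08 05:00:00",
        "2021-11-08 05:10:00",
    ]),
    "passengers": [12, 7, 3, 9],
})

# Complete 10-minute grid from the first to the last timestamp in the data.
full_range = pd.date_range(df["date"].min(), df["date"].max(), freq="10min")

filled = (
    df.set_index("date")
      .reindex(full_range, fill_value=0)  # missing timestamps get 0 passengers
      .rename_axis("date")                # restore the index name lost by reindex
      .reset_index()
)
```

With this approach the result is already in chronological order, and any other gaps in the file are filled as well, not only the one you identified.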

