
Calculating daily average from irregular time series using pandas

I am trying to obtain daily averages from an irregular time series stored in a CSV file.

The data in the CSV file start at 13:00 on 20 September 2013 and run until 10:57 on 14 January 2014:

Time                    Values
20/09/2013 13:00        5.133540
20/09/2013 13:01        5.144993
20/09/2013 13:02        5.158208
20/09/2013 13:03        5.170542
20/09/2013 13:04        5.167899    
20/09/2013 13:25        5.168780
20/09/2013 13:26        5.179351
...

I import them with:

import pandas as pd
data = pd.read_csv('<file name>', parse_dates={'Timestamp': ['Time']}, index_col='Timestamp')

This results in:

                           Values
Timestamp                          
2013-09-20 13:00:00        5.133540
2013-09-20 13:01:00        5.144993
2013-09-20 13:02:00        5.158208
2013-09-20 13:03:00        5.170542
2013-09-20 13:04:00        5.167899
2013-09-20 13:25:00        5.168780
2013-09-20 13:26:00        5.179351
...

And then I do:

# Older pandas spelled this data.resample('D', how='mean'); the how argument has since been removed.
dataDailyAv = data.resample('D').mean()

This results in:

                  Values
Timestamp                 
2013-01-10        8.623744
2013-01-11             NaN
2013-01-12             NaN
2013-01-13             NaN
2013-01-14             NaN
...

In other words, the result contains dates that do not appear in the original data, and for some of these dates (e.g. 10 January 2013) a value even appears.

Any ideas about what is going wrong?

Thanks.

Edit: apparently something goes wrong when the dates are parsed: 01/10/2013 is interpreted as 10 January 2013 instead of 1 October 2013. This can be solved by editing the date format in the CSV file, but is there a way to specify the date format in read_csv?

You need dayfirst=True, one of the many options listed in the read_csv docs.
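As a rough sketch of how that might look in practice: the sample rows below are made up for illustration (they are not the asker's file), and the comma-separated layout is an assumption, since the question does not show the raw file. The column name Time is taken from the question.

```python
import io
import pandas as pd

# Hypothetical sample in day/month/year order, standing in for the real CSV file.
csv_text = """Time,Values
30/09/2013 13:00,5.0
30/09/2013 13:25,7.0
01/10/2013 09:00,1.0
01/10/2013 09:30,3.0
"""

# dayfirst=True tells the parser to read 01/10/2013 as 1 October, not 10 January.
data = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Time"],
    dayfirst=True,
    index_col="Time",
)

# Daily mean over the irregularly spaced samples.
daily = data.resample("D").mean()
print(daily)

# The same ambiguity, shown on a single string:
month_first = pd.to_datetime("01/10/2013")               # 10 January 2013
day_first = pd.to_datetime("01/10/2013", dayfirst=True)  # 1 October 2013
print(month_first.date(), day_first.date())
```

With the day-first reading, the daily index should only span dates that actually occur in the data. Days inside that span that have no samples still show up as NaN, because resample produces a regular daily grid.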
