
How can I turn data imported into Python from a CSV file into a time series?

I want to turn data imported into Python from a .csv file into a time series.

import pandas as pd

GDP = pd.read_csv('GDP.csv')

In [87]: GDP
Out[87]: 
    GDP growth (%)
0              0.5
1             -5.2
2             -7.9
3             -9.1
4            -10.3
5             -8.8
6             -7.4
7            -10.1
8             -8.4
9             -8.7
10            -7.9
11            -4.1

Since data imported from a .csv file arrive as a DataFrame, I first tried turning them into a pd.Series:

GDP2 = pd.Series(data = GDP, index = pd.date_range(start = '01-2010', end = '01-2018', freq = 'Q'))

But what I got looked like this:

GDP2
Out[90]: 
2010-03-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2010-06-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2010-09-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2010-12-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2011-03-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2011-06-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2011-09-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2011-12-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2012-03-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2012-06-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2012-09-30    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))
2012-12-31    (G, D, P,  , g, r, o, w, t, h,  , (, %, ))

Something similar happened when I tried to do it through pd.DataFrame, except that every value came out as NaN:

GDP2 = pd.DataFrame(data = GDP, index = pd.date_range(start = '01-2010', end = '01-2018', freq = 'Q'))

GDP2
Out[92]: 
        GDP growth (%)
2010-03-31             NaN
2010-06-30             NaN
2010-09-30             NaN
2010-12-31             NaN
2011-03-31             NaN
2011-06-30             NaN
2011-09-30             NaN
2011-12-31             NaN
2012-03-31             NaN
2012-06-30             NaN
2012-09-30             NaN

The same happened when I tried it with reindex():

dates = pd.date_range(start = '01-2010', end = '01-2018', freq = 'Q')

dates
Out[100]: 
DatetimeIndex(['2010-03-31', '2010-06-30', '2010-09-30', '2010-12-31',
           '2011-03-31', '2011-06-30', '2011-09-30', '2011-12-31',
           '2012-03-31', '2012-06-30', '2012-09-30', '2012-12-31',
           '2013-03-31', '2013-06-30', '2013-09-30', '2013-12-31',
           '2014-03-31', '2014-06-30', '2014-09-30', '2014-12-31',
           '2015-03-31', '2015-06-30', '2015-09-30', '2015-12-31',
           '2016-03-31', '2016-06-30', '2016-09-30', '2016-12-31',
           '2017-03-31', '2017-06-30', '2017-09-30', '2017-12-31'],
          dtype='datetime64[ns]', freq='Q-DEC')

GDP.reindex(dates)

Out[101]: 
       GDP growth (%)
2010-03-31             NaN
2010-06-30             NaN
2010-09-30             NaN
2010-12-31             NaN
2011-03-31             NaN
2011-06-30             NaN
2011-09-30             NaN
2011-12-31             NaN
2012-03-31             NaN
2012-06-30             NaN
2012-09-30             NaN
2012-12-31             NaN

I'm surely making some stupid newbie mistake, and I would really appreciate it if someone could help me out here. Cheers.

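Use set_index with a date range that has exactly one entry per row (12 quarters here, so the range ends at '01-2013' rather than '01-2018'):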

df
     gdp
0    0.5
1   -5.2
2   -7.9
3   -9.1
4  -10.3
5   -8.8
6   -7.4
7  -10.1
8   -8.4
9   -8.7
10  -7.9
11  -4.1

df = df.set_index(pd.date_range(start = '01-2010', end = '01-2013',freq = 'Q'))

             gdp
2010-03-31   0.5
2010-06-30  -5.2
2010-09-30  -7.9
2010-12-31  -9.1
2011-03-31 -10.3
2011-06-30  -8.8
2011-09-30  -7.4
2011-12-31 -10.1
2012-03-31  -8.4
2012-06-30  -8.7
2012-09-30  -7.9
2012-12-31  -4.1
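A self-contained sketch of the set_index approach, in case it helps. It assumes the twelve values from the question and builds the frame inline instead of reading GDP.csv. It also passes periods=len(df) and an explicit pd.offsets.QuarterEnd() offset rather than a hard-coded end date and freq = 'Q'; that is my substitution, chosen so the number of dates always matches the number of rows (recent pandas versions also spell the 'Q' alias as 'QE').

```python
import pandas as pd

# Values copied from the question; the column name "gdp" follows the answer above.
df = pd.DataFrame({"gdp": [0.5, -5.2, -7.9, -9.1, -10.3, -8.8,
                           -7.4, -10.1, -8.4, -8.7, -7.9, -4.1]})

# One quarter-end date per row, starting with 2010-03-31.
# periods=len(df) keeps the index the same length as the data.
dates = pd.date_range(start="2010-01-01", periods=len(df),
                      freq=pd.offsets.QuarterEnd())

# Replace the default 0..11 index with the dates.
df = df.set_index(dates)

# Selecting the single column gives a pd.Series with a DatetimeIndex.
gdp_series = df["gdp"]
print(gdp_series.head())
```

If the real data cover a different number of quarters, only the start date needs to change.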

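To fix your own code, pass GDP.values instead of GDP. When you pass the DataFrame itself, pandas aligns it on its existing integer index (0 to 11), none of which appear in the new date index, so every row becomes NaN. Passing the underlying array assigns the values by position: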

GDP2 = pd.DataFrame(data = GDP.values, index = pd.date_range(start = '01-2010', end = '01-2013',freq = 'Q'))
GDP2
Out[71]: 
               0
2010-03-31   0.5
2010-06-30  -5.2
2010-09-30  -7.9
2010-12-31  -9.1
2011-03-31 -10.3
2011-06-30  -8.8
2011-09-30  -7.4
2011-12-31 -10.1
2012-03-31  -8.4
2012-06-30  -8.7
2012-09-30  -7.9
2012-12-31  -4.1
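The difference between the two calls can be checked side by side. This is a minimal sketch under the same assumptions as the question (twelve quarterly values, built inline here rather than read from GDP.csv). The columns=GDP.columns argument and the final pd.Series construction are additions of mine, to keep the column name and to get the Series the question originally asked for; to_numpy() is used as an equivalent of .values.

```python
import pandas as pd

GDP = pd.DataFrame({"GDP growth (%)": [0.5, -5.2, -7.9, -9.1, -10.3, -8.8,
                                       -7.4, -10.1, -8.4, -8.7, -7.9, -4.1]})
dates = pd.date_range(start="2010-01-01", periods=len(GDP),
                      freq=pd.offsets.QuarterEnd())

# Passing the DataFrame: pandas aligns on the old labels 0..11.
# None of them exist in `dates`, so every value is NaN.
aligned = pd.DataFrame(data=GDP, index=dates)

# reindex() does the same label lookup, with the same result.
reindexed = GDP.reindex(dates)

# Passing the raw array: values are assigned by position.
GDP2 = pd.DataFrame(data=GDP.to_numpy(), index=dates, columns=GDP.columns)

# The same idea for a Series built from the single column.
GDP_series = pd.Series(GDP["GDP growth (%)"].to_numpy(), index=dates,
                       name="GDP growth (%)")

print(aligned.isna().all().all())   # label alignment lost the data
print(GDP2.head())
```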

