
Time series data indexing using pandas or numpy

Below is my 1-minute OHLC data.

2011-11-01,9:00:00,248.50,248.95,248.20,248.70
2011-11-01,9:01:00,248.70,249.00,248.65,248.85
2011-11-01,9:02:00,248.90,249.25,248.70,249.15
...
2011-11-01,15:03:00,250.25,250.30,250.05,250.15
2011-11-01,15:04:00,250.15,250.60,250.10,250.60
2011-11-01,15:15:00,250.55,250.55,250.55,250.55
2011-11-02,9:00:00,245.55,246.25,245.40,245.80
2011-11-02,9:01:00,245.85,246.40,245.75,246.35
2011-11-02,9:02:00,246.30,246.45,245.75,245.80
2011-11-02,9:03:00,245.75,245.85,245.30,245.35
...

I loaded the data, and this is the resulting DataFrame:

                          2       3       4       5
0_1                                                                    
2011-11-01 09:00:00  248.50  248.95  248.20  248.70
2011-11-01 09:01:00  248.70  249.00  248.65  248.85
2011-11-01 09:02:00  248.90  249.25  248.70  249.15
2011-11-01 09:03:00  249.20  249.60  249.10  249.60
2011-11-01 09:04:00  249.55  249.95  249.50  249.60
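
For reference, a frame shaped like the one above could be built roughly as follows. This is only a sketch: the in-memory buffer stands in for the real file, and the names `raw`, `stamp` and `prices` are made up for illustration. The index name `0_1` is set by hand to match the printout.

```python
import io
import pandas as pd

# A few rows copied from the sample above; a real file path would replace this buffer.
raw = io.StringIO(
    "2011-11-01,9:00:00,248.50,248.95,248.20,248.70\n"
    "2011-11-01,9:01:00,248.70,249.00,248.65,248.85\n"
    "2011-11-02,9:00:00,245.55,246.25,245.40,245.80\n"
)

# The file has no header row, so the columns are numbered 0..5.
frame = pd.read_csv(raw, header=None)

# Column 0 holds the date and column 1 the time of day; add them into one timestamp.
stamp = pd.to_datetime(frame[0]) + pd.to_timedelta(frame[1])

# Keep the four price columns and use the combined timestamp as the index.
prices = frame[[2, 3, 4, 5]].copy()
prices.index = pd.DatetimeIndex(stamp, name="0_1")
print(prices)
```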

I want to add the following four columns so that I can use groupby:

                          2       3       4       5    year month day time
0_1                                                                    
2011-11-01 09:00:00  248.50  248.95  248.20  248.70       0      0  0    0
2011-11-01 09:01:00  248.70  249.00  248.65  248.85       0      0  0    1
2011-11-01 09:02:00  248.90  249.25  248.70  249.15       0      0  0    2
2011-11-01 09:03:00  249.20  249.60  249.10  249.60       0      0  0    3
2011-11-01 09:04:00  249.55  249.95  249.50  249.60       0      0  0    4
....
2011-11-02 09:00:00  248.50  248.95  248.20  248.70       0      0  1    0
2011-11-02 09:01:00  248.70  249.00  248.65  248.85       0      0  1    1
2011-11-02 09:02:00  248.90  249.25  248.70  249.15       0      0  1    2
2011-11-02 09:03:00  249.20  249.60  249.10  249.60       0      0  1    3
2011-11-02 09:04:00  249.55  249.95  249.50  249.60       0      0  1    4

How can I add index columns like these?

Thanks in advance.

You can do this with relativedelta from the dateutil library:

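import pandas as pd
from dateutil.relativedelta import relativedelta

start = df.index[0]  # reference timestamp: the first row

def func(item):
    # Calendar difference between this row's timestamp and the first one.
    delta = relativedelta(item, start)
    return (delta.years, delta.months, delta.days)

parts = pd.DataFrame([func(ts) for ts in df.index],
                     index=df.index, columns=['year', 'month', 'day'])
print(parts)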

                     year  month  day
0_1                                  
2011-11-01 09:00:00     0      0    0
2011-11-01 09:01:00     0      0    0
2011-11-01 09:02:00     0      0    0
2011-11-01 15:03:00     0      0    0
2011-11-01 15:04:00     0      0    0
2011-11-01 15:15:00     0      0    0
2011-11-02 09:00:00     0      0    1
2011-11-02 09:01:00     0      0    1
2011-11-02 09:02:00     0      0    1
2011-11-02 09:03:00     0      0    1

Afterwards, you can merge it with the original DataFrame on the index.
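
The merge step could look something like the sketch below. The three-row frame is a stand-in built from the sample rows in the question, and the names `parts` and `result` are made up for illustration. `DataFrame.join` is used here because both frames share the same index; `pd.merge` with `left_index=True, right_index=True` should give the same result.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

# Small stand-in for the question's frame (values copied from the sample rows).
index = pd.DatetimeIndex(
    ["2011-11-01 09:00:00", "2011-11-01 09:01:00", "2011-11-02 09:00:00"],
    name="0_1",
)
df = pd.DataFrame(
    [[248.50, 248.95, 248.20, 248.70],
     [248.70, 249.00, 248.65, 248.85],
     [245.55, 246.25, 245.40, 245.80]],
    index=index, columns=[2, 3, 4, 5],
)

start = df.index[0]

def func(item):
    # Calendar difference between this row's timestamp and the first one.
    delta = relativedelta(item, start)
    return (delta.years, delta.months, delta.days)

parts = pd.DataFrame([func(ts) for ts in df.index],
                     index=df.index, columns=["year", "month", "day"])

# Both frames carry the same unique index, so join lines the rows up one to one.
result = df.join(parts)
print(result)
```

Note that this relies on the timestamps being unique. With duplicate index values a join would repeat rows.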

I'm not sure what the time column represents. Minutes?
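
If it is a per-day minute counter, something along these lines might work. This is a sketch under two possible readings of the question, neither confirmed by the asker: either `time` is the position of the bar within its day, or it is the number of whole minutes since that day's first bar. The column name `close` and the helper names `day`, `ts` and `first_bar` are made up for illustration.

```python
import pandas as pd

# Stand-in frame: closing prices from the sample rows, two trading days.
index = pd.DatetimeIndex(
    ["2011-11-01 09:00:00", "2011-11-01 09:01:00", "2011-11-01 09:02:00",
     "2011-11-02 09:00:00", "2011-11-02 09:01:00"],
    name="0_1",
)
df = pd.DataFrame({"close": [248.70, 248.85, 249.15, 245.80, 246.35]}, index=index)

# Calendar day of each row (timestamp truncated to midnight), used as the group key.
day = df.index.normalize()

# Reading 1 (assumption): "time" is the position of the bar within its day.
df["time"] = df.groupby(day).cumcount()

# Reading 2 (assumption): "time" is whole minutes elapsed since the day's first bar.
ts = df.index.to_series()
first_bar = ts.groupby(day).transform("min")
df["minutes"] = (ts - first_bar) // pd.Timedelta(minutes=1)

print(df)
```

The two columns agree as long as no minute is missing. After a gap such as the jump from 15:04 to 15:15 in the raw data, the positional counter advances by one while the elapsed-minutes counter advances by eleven.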

