
How to modify the time that 'date' changes (00:00:00) in an index in Pandas dataframe?

I have a dataframe that looks like this:

Date and Time           Close   dif
2015/01/01 17:00:00.211 2030.25 0.3
2015/01/01 17:00:02.456 2030.75 0.595137615
2015/01/01 23:55:01.491 2037.25 2.432613592
2015/01/02 00:02:01.955 2036.75 -0.4
2015/01/02 00:04:04.887 2036.5  -0.391144414
2015/01/02 15:14:56.207 2021.5  -4.732676608
2015/01/02 15:14:59.020 2021.5  -4.731171953
2015/01/02 15:30:00.020 2022    -4.228169436
2015/01/02 16:13:18.948 2021.25 -4.96153033
2015/01/02 16:15:00.000 2021    -5.210187988
2015/01/04 17:00:00.105 2020.5  0
2015/01/04 17:00:01.077 2021    0.423093923

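How can I modify the index so that the current date starts at 17:00:00 of the previous day and ends at 15:15:00? (Data between 15:15:00 and 17:00:00 can be deleted.)

The new dataframe would look like this: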

Date and Time           Close   dif
2015/01/02 17:00:00.211 2030.25 0.3
2015/01/02 17:00:02.456 2030.75 0.595137615
2015/01/02 23:55:01.491 2037.25 2.432613592
2015/01/02 00:02:01.955 2036.75 -0.4
2015/01/02 00:04:04.887 2036.5  -0.391144414
2015/01/02 15:14:56.207 2021.5  -4.732676608
2015/01/02 15:14:59.020 2021.5  -4.731171953
2015/01/05 17:00:00.105 2020.5  0
2015/01/05 17:00:01.077 2021    0.423093923

Thanks

Is this what you want?

# read in your dataframe
import pandas as pd
df = pd.read_csv('dt_data.csv', skipinitialspace=True)
df.columns = ['mydt', 'close', 'dif'] # changed your column name to 'mydt'
df.mydt = pd.to_datetime(df.mydt) # convert mydt to datetime so we can operate on it

# drop rows whose time of day falls strictly between 15:15:00 and 17:00:00
# (compare the full time of day; testing hour and minute separately
# would also drop rows such as 23:55)
tod = df.mydt - df.mydt.dt.normalize() # time elapsed since midnight
in_gap = (tod > pd.Timedelta(hours=15, minutes=15)) & (tod < pd.Timedelta(hours=17))
df = df[~in_gap].copy()

# rows at or after 17:00 belong to the next 'day', so shift them forward one day
ndx = df[df.mydt.dt.hour >= 17].index
df.loc[ndx, 'mydt'] += pd.Timedelta(days=1)

df.set_index('mydt', inplace=True, drop=True)
print(df)

                           close       dif
mydt                                      
2015-01-02 17:00:00.211  2030.25  0.300000
2015-01-02 17:00:02.456  2030.75  0.595138
2015-01-02 23:55:01.491  2037.25  2.432614
2015-01-02 00:02:01.955  2036.75 -0.400000
2015-01-02 00:04:04.887  2036.50 -0.391144
2015-01-02 15:14:56.207  2021.50 -4.732677
2015-01-02 15:14:59.020  2021.50 -4.731172
2015-01-05 17:00:00.105  2020.50  0.000000
2015-01-05 17:00:01.077  2021.00  0.423094

Edit: to address the groupby question from the comments. If you only need the date part of the datetime column mydt above, you can do the following:

df.reset_index(inplace=True)
print(df.mydt.dt.date)

0    2015-01-02
1    2015-01-02
2    2015-01-02
3    2015-01-02
4    2015-01-02
5    2015-01-02
6    2015-01-02
7    2015-01-05
8    2015-01-05
dtype: object

Then you can group using only the date part:

print(df.groupby(df.mydt.dt.date)['dif'].sum())

2015-01-02   -6.927242
2015-01-05    0.423094
Name: dif, dtype: float64
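
If the goal is only to group by trading 'day', one alternative is to leave the timestamps untouched and derive a separate session date. Because the new day starts at 17:00, adding 7 hours pushes every timestamp at or after 17:00 past midnight, so the date of the shifted value is the session date. The following is a minimal sketch, not part of the original answer: the names RAW, SESSION_START_HOUR and session_date are made up for illustration, and the sample rows are copied inline from the question so it runs without dt_data.csv.

```python
import io
import pandas as pd

# Sample rows from the question, inlined so no CSV file is needed.
RAW = """\
2015/01/01 17:00:00.211 2030.25 0.3
2015/01/01 17:00:02.456 2030.75 0.595137615
2015/01/01 23:55:01.491 2037.25 2.432613592
2015/01/02 00:02:01.955 2036.75 -0.4
2015/01/02 00:04:04.887 2036.5  -0.391144414
2015/01/02 15:14:56.207 2021.5  -4.732676608
2015/01/02 15:14:59.020 2021.5  -4.731171953
2015/01/02 15:30:00.020 2022    -4.228169436
2015/01/02 16:13:18.948 2021.25 -4.96153033
2015/01/02 16:15:00.000 2021    -5.210187988
2015/01/04 17:00:00.105 2020.5  0
2015/01/04 17:00:01.077 2021    0.423093923
"""

# Fields are whitespace-separated, so date and time arrive as two columns.
df = pd.read_csv(io.StringIO(RAW), sep=r"\s+", header=None,
                 names=["date", "time", "close", "dif"],
                 dtype={"date": str, "time": str})
df["mydt"] = pd.to_datetime(df["date"] + " " + df["time"],
                            format="%Y/%m/%d %H:%M:%S.%f")
df = df.drop(columns=["date", "time"])

# Drop rows strictly between 15:15:00 and 17:00:00.
tod = df["mydt"] - df["mydt"].dt.normalize()  # time elapsed since midnight
in_gap = (tod > pd.Timedelta(hours=15, minutes=15)) & (tod < pd.Timedelta(hours=17))
df = df[~in_gap].copy()

# Assumed session start, taken from the question: 17:00 + 7h = 24:00,
# so anything at or after 17:00 rolls over to the next calendar date.
SESSION_START_HOUR = 17
shift = pd.Timedelta(hours=24 - SESSION_START_HOUR)
df["session_date"] = (df["mydt"] + shift).dt.date

# Aggregate per session while the original timestamps stay intact.
session_sums = df.groupby("session_date")["dif"].sum()
print(session_sums)
```

This keeps mydt as the real timestamp, so the rows stay in chronological order, and session_date groups them into the 2015-01-02 and 2015-01-05 sessions.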

