
python pandas overwriting the timestamps

I have a dataset in the following format.

420,426,2013-04-28T23:59:21,7,20
421,427,2013-04-28T23:59:21,13,12
422,428,2013-04-28T23:59:22,10,16
423,510,2013-04-28T23:59:22,0,1
424,511,2013-04-28T23:59:22,9,0
425,1,2013-04-29T00:04:21,19,5
426,2,2013-04-29T00:04:21,25,1
427,3,2013-04-29T00:04:22,14,7
428,4,2013-04-29T00:04:22,18,2

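I am using pandas, and we are dealing with a huge dataset. I want to divide the data into 5-minute intervals. I am using the code below to get the groups.

Is there an efficient way to replace the timestamps in the original dataset with the timestamp of the new group? For example, in this sample we want the first five rows to be labeled with the same timestamp, namely the timestamp of the group they fall into.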

import os

import numpy as np
import pandas as pd

file_name = os.path.join("..", "..", "Dataset", "all_rawdata.csv")

# dtype has to be passed as a keyword argument
dataset = pd.read_csv(
    file_name,
    dtype={"ID": np.int32, "station": np.int32, "time": str,
           "slots": np.int32, "available": np.int32},
)

# Parse the timestamp strings and use them as the index
dataset['time'] = pd.to_datetime(dataset['time'])
dataset.set_index(dataset.time, inplace=True)

# pd.Grouper(freq=...) replaces TimeGrouper, which recent pandas no longer provides
data1 = dataset.groupby(pd.Grouper(freq='5min'))

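Take the first rows of each group from the GroupBy object (this answer originally did so with .transform; on recent pandas versions GroupBy.head is the equivalent call):

import numpy
import pandas

# date_range replaces the DatetimeIndex(start=..., end=..., freq=...)
# constructor form, which recent pandas versions no longer accept
dtindex = pandas.date_range(
    start='2012-01-01',
    end='2012-01-15',
    freq='10s'
)

df = pandas.DataFrame(
    data=numpy.random.normal(size=(len(dtindex), 2)),
    index=dtindex,
    columns=['A', 'B']
)
# pandas.Grouper(freq=...) replaces the removed pandas.TimeGrouper
groups_5min = df.groupby(pandas.Grouper(freq='5min'))
# Originally groups_5min.transform(lambda g: g.head(5)); recent pandas requires
# transform to return one row per input row, so GroupBy.head is used instead
first_5_of_everything = groups_5min.head(5)
print(first_5_of_everything.head(20))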


                            A         B
2012-01-01 00:00:00  1.596596  0.523592
2012-01-01 00:00:10 -0.922953  0.496072
2012-01-01 00:00:20  0.307187 -1.336588
2012-01-01 00:00:30  1.063472  0.700835
2012-01-01 00:00:40  0.818054 -2.150868
2012-01-01 00:05:00 -1.457456  0.239977   # <--- jumps ahead
2012-01-01 00:05:10 -0.918154  1.391162
2012-01-01 00:05:20  0.032661  0.197498
2012-01-01 00:05:30 -1.788646 -0.539537
2012-01-01 00:05:40 -0.147163  0.953631
2012-01-01 00:10:00  0.226996 -0.327286   # <--- jumps ahead
2012-01-01 00:10:10 -0.514218  0.053867
2012-01-01 00:10:20 -0.627977 -1.370492
2012-01-01 00:10:30 -0.217245 -0.979994
2012-01-01 00:10:40 -0.164559  0.799679
2012-01-01 00:15:00  0.155583 -1.489055   # <--- jumps ahead
2012-01-01 00:15:10 -1.557037 -1.285676
2012-01-01 00:15:20  0.555650  0.223248
2012-01-01 00:15:30 -0.619089  0.954938
2012-01-01 00:15:40  0.371026  2.906548
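
The output above keeps the first five rows of every 5-minute group with their original timestamps. If the goal is instead to overwrite each row's timestamp with the label of its 5-minute bin, as the question describes, one possible approach is to round the timestamps down with Series.dt.floor. This is only a sketch under some assumptions: the sample rows from the question are loaded inline through io.StringIO rather than from the CSV file, and the column names are taken from the dtype mapping in the question.

```python
import io

import pandas as pd

# Sample rows copied from the question; loading them inline is an assumption
# made so the example runs without the original CSV file.
raw = """420,426,2013-04-28T23:59:21,7,20
421,427,2013-04-28T23:59:21,13,12
422,428,2013-04-28T23:59:22,10,16
423,510,2013-04-28T23:59:22,0,1
424,511,2013-04-28T23:59:22,9,0
425,1,2013-04-29T00:04:21,19,5
426,2,2013-04-29T00:04:21,25,1
427,3,2013-04-29T00:04:22,14,7
428,4,2013-04-29T00:04:22,18,2
"""
# Column names assumed from the dtype mapping in the question.
columns = ["ID", "station", "time", "slots", "available"]
dataset = pd.read_csv(io.StringIO(raw), names=columns)
dataset["time"] = pd.to_datetime(dataset["time"])

# Round every timestamp down to the start of its 5-minute bin, so all rows
# in the same bin end up carrying the same timestamp.
dataset["time"] = dataset["time"].dt.floor("5min")

print(dataset)
```

With this sample, the first five rows should end up in the 23:55 bin and the remaining four in the 00:00 bin. Because floor is a vectorized operation on the column, it avoids a per-group Python function, which may matter on a large dataset.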
