
Calculate difference between 'times' rows in DataFrame Pandas

My DataFrame has the form:

       TimeWeek   TimeSat  TimeHoli
0      6:40:00   8:00:00   8:00:00
1      6:45:00   8:05:00   8:05:00
2      6:50:00   8:09:00   8:10:00
3      6:55:00   8:11:00   8:14:00
4      6:58:00   8:13:00   8:17:00
5      7:40:00   8:15:00   8:21:00

I need to find the time difference between consecutive rows in each of TimeWeek, TimeSat and TimeHoli. The output must be:

TimeWeekDiff   TimeSatDiff  TimeHoliDiff
00:05:00          00:05:00       00:05:00
00:05:00          00:04:00       00:05:00
00:05:00          00:02:00       00:04:00  
00:03:00          00:02:00       00:03:00
00:02:00          00:02:00       00:04:00 

I tried df['TimeWeek'] - df['TimeWeek'].shift().fillna(0), but it throws an error:

TypeError: unsupported operand type(s) for -: 'str' and 'str'

Possibly this is because of the ':' in the column values. How can I resolve this?

It looks like the error occurs because the data are strings rather than timestamps. First convert them to timestamps:

df2 = df.apply(lambda x: [pd.Timestamp(ts) for ts in x])

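By default they will carry today's date, but that does not matter once you take the difference between times (hopefully you do not have to worry about differences that cross midnight, such as 23:55 and 00:05).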
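If times that cross midnight do occur, one possible workaround is to add a day back whenever the difference comes out negative. This is only a sketch under the assumption that consecutive rows are never more than 24 hours apart; the sample values and the names `times`, `stamps`, `diff` and `wrapped` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample that crosses midnight.
times = pd.Series(["23:55:00", "00:05:00", "00:20:00"])

# Parse the strings; every value gets the same placeholder date.
stamps = pd.to_datetime(times, format="%H:%M:%S")

# Row-to-row differences; the first row has no predecessor, so it is NaT.
diff = stamps.diff()

# A negative gap means the clock wrapped past midnight, so add one day back.
wrapped = diff.where(diff >= pd.Timedelta(0), diff + pd.Timedelta(days=1))
print(wrapped)
```

The first row stays NaT because NaT fails the comparison and NaT plus one day is still NaT.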

After the conversion, simply subtract the shifted DataFrame:

>>> df2 - df2.shift()
   TimeWeek  TimeSat  TimeHoli
0       NaT      NaT       NaT
1  00:05:00 00:05:00  00:05:00
2  00:05:00 00:04:00  00:05:00
3  00:05:00 00:02:00  00:04:00
4  00:03:00 00:02:00  00:03:00
5  00:42:00 00:02:00  00:04:00

Depending on your needs, you can take rows 1 onward (skipping the NaT row):

(df2 - df2.shift()).iloc[1:, :]

Or you can fill the NaT values with a zero timedelta:

(df2 - df2.shift()).fillna(pd.Timedelta(0))
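The question asks for output that looks like 00:05:00. Timedelta values have no strftime, so one way to get that text is to format the total seconds by hand. The helper name `format_hms` and the sample data below are hypothetical, and the sketch assumes non-negative differences.

```python
import pandas as pd

def format_hms(td):
    """Render a Timedelta as HH:MM:SS; missing values become an empty string."""
    if pd.isna(td):
        return ""
    total = int(td.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Hypothetical sample taken from the TimeWeek column.
stamps = pd.Series([pd.Timestamp(t) for t in ["6:55:00", "6:58:00", "7:40:00"]])
diff = stamps - stamps.shift()

# Iterating over a timedelta Series yields Timedelta (or NaT) objects.
formatted = [format_hms(td) for td in diff]
print(formatted)
```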

Forget everything I just said. Pandas has good timedelta parsing.

df["TimeWeek"] = pd.to_timedelta(df["TimeWeek"])
df["TimeWeek"] - df["TimeWeek"].shift().fillna(pd.to_timedelta("00:00:00"))
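The same idea can be applied to all three columns at once. The following is a minimal sketch, assuming the data from the question and using DataFrame.diff() in place of the explicit shift-and-subtract; the names `deltas` and `diffs` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "TimeWeek": ["6:40:00", "6:45:00", "6:50:00", "6:55:00", "6:58:00", "7:40:00"],
    "TimeSat":  ["8:00:00", "8:05:00", "8:09:00", "8:11:00", "8:13:00", "8:15:00"],
    "TimeHoli": ["8:00:00", "8:05:00", "8:10:00", "8:14:00", "8:17:00", "8:21:00"],
})

# Convert every column from strings to timedelta values.
deltas = df.apply(pd.to_timedelta)

# Row-to-row differences; drop the first row (all NaT) and rename the columns.
diffs = deltas.diff().iloc[1:].add_suffix("Diff")
print(diffs)
```

With this data the last TimeWeekDiff value is 42 minutes (7:40:00 minus 6:58:00).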
Alternatively, parse the strings with pd.to_datetime and subtract the shifted column:

>>> import pandas as pd
>>> df = pd.DataFrame({'TimeWeek': ['6:40:00', '6:45:00', '6:50:00', '6:55:00', '7:40:00']})
>>> df["TimeWeek_date"] = pd.to_datetime(df["TimeWeek"], format="%H:%M:%S")
>>> print(df)
  TimeWeek       TimeWeek_date
0  6:40:00 1900-01-01 06:40:00
1  6:45:00 1900-01-01 06:45:00
2  6:50:00 1900-01-01 06:50:00
3  6:55:00 1900-01-01 06:55:00
4  7:40:00 1900-01-01 07:40:00
>>> df['TimeWeekDiff'] = (df['TimeWeek_date'] - df['TimeWeek_date'].shift().fillna(pd.to_datetime("00:00:00", format="%H:%M:%S")))
>>> print(df)
  TimeWeek       TimeWeek_date  TimeWeekDiff
0  6:40:00 1900-01-01 06:40:00      06:40:00
1  6:45:00 1900-01-01 06:45:00      00:05:00
2  6:50:00 1900-01-01 06:50:00      00:05:00
3  6:55:00 1900-01-01 06:55:00      00:05:00
4  7:40:00 1900-01-01 07:40:00      00:45:00

