簡體   English   中英

以小時和分鍾為單位計算兩個 Pandas 列之間的時間差

[英]Calculate Time Difference Between Two Pandas Columns in Hours and Minutes

我在數據todate有兩列fromdatetodate

import pandas as pd

data = {'todate': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000'), pd.Timestamp('2014-01-23 10:07:47.660000')],
        'fromdate': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000'), pd.Timestamp('2014-01-23 18:50:41.420000')]}

df = pd.DataFrame(data)

我添加了一個新列diff ,以使用以下方法查找兩個日期之間的差異

df['diff'] = df['fromdate'] - df['todate']

我得到diff列,但它包含days ,當超過 24 小時時。

                   todate                fromdate                   diff
0 2014-01-24 13:03:12.050 2014-01-26 23:41:21.870 2 days 10:38:09.820000
1 2014-01-27 11:57:18.240 2014-01-27 15:38:22.540 0 days 03:41:04.300000
2 2014-01-23 10:07:47.660 2014-01-23 18:50:41.420 0 days 08:42:53.760000

如何將我的結果僅轉換為小時和分鍾(即天轉換為小時)?

Pandas 時間戳差異返回一個 datetime.timedelta 對象。 這可以通過使用 *as_type* 方法輕松轉換為小時,如下所示

import pandas
df = pandas.DataFrame(columns=['to','fr','ans'])
df.to = [pandas.Timestamp('2014-01-24 13:03:12.050000'), pandas.Timestamp('2014-01-27 11:57:18.240000'), pandas.Timestamp('2014-01-23 10:07:47.660000')]
df.fr = [pandas.Timestamp('2014-01-26 23:41:21.870000'), pandas.Timestamp('2014-01-27 15:38:22.540000'), pandas.Timestamp('2014-01-23 18:50:41.420000')]
(df.fr-df.to).astype('timedelta64[h]')

屈服,

0    58
1     3
2     8
dtype: float64

這讓我.astype()因為上面的.astype()解決方案對我不起作用。 但我找到了另一種方法。 沒有計時或任何東西,但可能對其他人有用:

t1 = pd.to_datetime('1/1/2015 01:00')
t2 = pd.to_datetime('1/1/2015 03:30')

print pd.Timedelta(t2 - t1).seconds / 3600.0

...如果你想要幾個小時。 要么:

print pd.Timedelta(t2 - t1).seconds / 60.0

...如果你想要幾分鍾。

更新:這里曾經有一條有用的評論提到使用.total_seconds()跨越多天的時間段。 既然它不見了,我已經更新了答案。

  • 如何將結果轉換為僅小時和分鍾
    • 接受的答案只返回days + hours 分鍾不包括在內。
  • 要提供小時和分鍾為hh:mmx hours y minutes ,需要額外的計算和字符串格式。
  • 此答案顯示了如何使用timedelta數學將總小時數或總分鍾數作為浮點數timedelta ,並且比使用.astype('timedelta64[h]')更快
  • Pandas 時間增量用戶指南
  • Pandas 時間序列/日期功能用戶指南
  • python timedelta objects :請參閱支持的操作。
  • 以下示例數據已經是datetime64[ns] dtype 需要使用pandas.to_datetime()轉換所有相關列。
import pandas as pd

# test data from OP, with values already in a datetime format
data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000'), pd.Timestamp('2014-01-23 10:07:47.660000')],
        'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000'), pd.Timestamp('2014-01-23 18:50:41.420000')]}

# test dataframe; the columns must be in a datetime format; use pandas.to_datetime if needed
df = pd.DataFrame(data)

# add a timedelta column if wanted. It's added here for information only
# df['time_delta_with_sub'] = df.from_date.sub(df.to_date)  # also works
df['time_delta'] = (df.from_date - df.to_date)

# create a column with timedelta as total hours, as a float type
df['tot_hour_diff'] = (df.from_date - df.to_date) / pd.Timedelta(hours=1)

# create a colume with timedelta as total minutes, as a float type
df['tot_mins_diff'] = (df.from_date - df.to_date) / pd.Timedelta(minutes=1)

# display(df)
                  to_date               from_date             time_delta  tot_hour_diff  tot_mins_diff
0 2014-01-24 13:03:12.050 2014-01-26 23:41:21.870 2 days 10:38:09.820000      58.636061    3518.163667
1 2014-01-27 11:57:18.240 2014-01-27 15:38:22.540 0 days 03:41:04.300000       3.684528     221.071667
2 2014-01-23 10:07:47.660 2014-01-23 18:50:41.420 0 days 08:42:53.760000       8.714933     522.896000

其他方法

  • 其他資源中播客的一個注意事項, .total_seconds()是在核心開發人員休假時添加和合並的,並且不會被批准。
    • 這也是沒有其他.total_xx方法的原因。
# convert the entire timedelta to seconds
# this is the same as td / timedelta(seconds=1)
(df.from_date - df.to_date).dt.total_seconds()
[out]:
0    211089.82
1     13264.30
2     31373.76
dtype: float64

# get the number of days
(df.from_date - df.to_date).dt.days
[out]:
0    2
1    0
2    0
dtype: int64

# get the seconds for hours + minutes + seconds, but not days
# note the difference from total_seconds
(df.from_date - df.to_date).dt.seconds
[out]:
0    38289
1    13264
2    31373
dtype: int64

其他資源

%%timeit測試

import pandas as pd

# dataframe with 2M rows
data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000')], 'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000')]}
df = pd.DataFrame(data)
df = pd.concat([df] * 1000000).reset_index(drop=True)

%%timeit
(df.from_date - df.to_date) / pd.Timedelta(hours=1)
[out]:
43.1 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
(df.from_date - df.to_date).astype('timedelta64[h]')
[out]:
59.8 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM