Converting ints to timedelta with pandas

I have some values in a pandas DataFrame that are positive and negative ints, and I want to convert them to timedeltas so I can put them into a DurationField in a Django model.

             date  dep_time dep_delay  arr_time arr_delay cancelled carrier  \
103992 2014-05-11  10:13:00        -2  12:47:00       -13         0      B6   
103993 2014-05-11  19:29:00        -1  22:15:00       -24         0      B6   
103994 2014-05-11  11:17:00         5  13:55:00         9         0      B6   
103995 2014-05-11  07:36:00       -10  09:24:00       -18         0      B6   
103996 2014-05-11  13:40:00         0  16:47:00        10         0      B6   

       tailnum flight origin dest air_time distance duration  
103992  N630JB    925    JFK  TPA      137     1005     1013  
103993  N632JB    225    JFK  TPA      137     1005     1929  
103994  N635JB    127    EWR  MCO      126      937     1117  
103995  N637JB   1273    JFK  CHS       92      636     0736  
103996  N637JB    213    JFK  LGB      352     2465     1340  

With this data, I want to express dep_delay, arr_delay, air_time and duration as timedeltas, but I keep getting zeroed-out values. I'm using

data['air_time'] = pd.to_timedelta(data['air_time'], errors='coerce')

If you are getting all 00:00:00.000000 values, then your air_time values might be strings. (You can check the data type of the air_time column by inspecting data.info(). If the dtype says object, then the values are Python objects (such as strs) instead of a NumPy integer data type. You can then confirm they are strings by inspecting set(map(type, data['air_time'])).)
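The check described above can be sketched as follows. The sample values are taken from the air_time column in the question; the column is built with dtype=object here purely as an assumption, to mimic data that was read in as strings.

```python
import pandas as pd

# Assumption: air_time was loaded as strings, so the column holds Python objects
data = pd.DataFrame(
    {'air_time': pd.Series(['137', '137', '126', '92', '352'], dtype=object)}
)

# "object" signals Python objects rather than a NumPy integer type
print(data['air_time'].dtype)

# Collect the distinct Python types actually stored in the column
types_found = set(map(type, data['air_time']))
print(types_found)
```

If the set contains only str, converting with astype(int) before calling pd.to_timedelta is the natural next step.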

If they are strings, you can convert them to ints first by using:

data['air_time'] = data['air_time'].astype(int)

If 137 means 137 minutes, then use

data['air_time'] = pd.to_timedelta(data['air_time'], unit='m', errors='coerce')

If, on the other hand, 137 means 1 hour and 37 minutes, then use

data['air_time'] = pd.to_timedelta(
    (data['air_time']//100)*60 + (data['air_time'] % 100), unit='m', 
    errors='coerce')

The unit='m' argument tells pd.to_timedelta to interpret the values as minutes. Without a unit, numeric input is interpreted as nanoseconds, so a value such as 137 becomes a timedelta that is practically zero.
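To make the difference between the two readings concrete, here is a small sketch that converts the same integers both ways. The variable names (raw, as_minutes, as_hhmm) are illustrative only and do not come from the question.

```python
import pandas as pd

raw = pd.Series([137, 126, 92, 352])

# Reading 1: each value is a plain count of minutes (137 -> 2 h 17 min)
as_minutes = pd.to_timedelta(raw, unit='m')

# Reading 2: each value is encoded as HMM (137 -> 1 h 37 min):
# the hundreds part is hours, the remainder is minutes
total_minutes = (raw // 100) * 60 + (raw % 100)
as_hhmm = pd.to_timedelta(total_minutes, unit='m')

print(pd.DataFrame({'raw': raw, 'as_minutes': as_minutes, 'as_hhmm': as_hhmm}))
```

Which reading is correct depends on how the data was recorded, so it is worth checking a few rows against known flight times before committing to one.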

For example,

import pandas as pd

data = pd.DataFrame({'air_time':['137','137','126','92','352']})
data['air_time'] = data['air_time'].astype(int)
data['air_time'] = pd.to_timedelta(data['air_time'], unit='m', errors='coerce')

yields

  air_time
0 02:17:00
1 02:17:00
2 02:06:00
3 01:32:00
4 05:52:00

Note that pd.to_timedelta can also accept strings as input if the strings contain the desired units. For example,

import pandas as pd

data = pd.DataFrame({'air_time':['137','137','126','92','352']})
data['air_time'] = data['air_time'] + ' minutes'
#       air_time
# 0  137 minutes
# 1  137 minutes
# 2  126 minutes
# 3   92 minutes
# 4  352 minutes

data['air_time'] = pd.to_timedelta(data['air_time'], errors='coerce')

yields the same result.
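The same approach should carry over to the other columns in the question, including the negative delays. The sketch below assumes that dep_delay, arr_delay and air_time are all whole minutes, which the question does not state explicitly. Since Django's DurationField works with Python's datetime.timedelta, the last step converts the pandas values with Timedelta.to_pytimedelta(); the list name py_delays is only illustrative.

```python
import datetime
import pandas as pd

# Sample values copied from the question's table (assumed to be minutes)
data = pd.DataFrame({
    'dep_delay': [-2, -1, 5, -10, 0],
    'arr_delay': [-13, -24, 9, -18, 10],
    'air_time': [137, 137, 126, 92, 352],
})

# Convert each integer column to timedelta64, treating the values as minutes
for col in ['dep_delay', 'arr_delay', 'air_time']:
    data[col] = pd.to_timedelta(data[col], unit='m', errors='coerce')

# Iterating a timedelta column yields pd.Timedelta objects;
# to_pytimedelta() turns each one into a plain datetime.timedelta
py_delays = [td.to_pytimedelta() for td in data['dep_delay']]
print(py_delays[0])
```

Negative integers become negative timedeltas, so an early departure of 2 minutes is stored as a duration of minus 2 minutes.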
