
Pandas: how to use between_time with milliseconds?

Consider this:

import pandas as pd
import numpy as np

idx2=[pd.to_datetime('2016-08-31 22:08:12.000'), 
     pd.to_datetime('2016-08-31 22:08:12.200'),
     pd.to_datetime('2016-08-31 22:08:12.400')]

test = pd.DataFrame({'value': [1, 1, 3], 'groups': ['A', np.nan, 'A']}, index=idx2)
test
Out[27]: 
                        groups  value
2016-08-31 22:08:12.000      A      1
2016-08-31 22:08:12.200    NaN      1
2016-08-31 22:08:12.400      A      3

I need to keep only the data between 22:08:12.200 and 22:08:12.400, so I naturally use between_time:

test.between_time('22:08:12.200','22:08:12.400')

gives

ValueError: Cannot convert arg ['22:08:12.200'] to a time

What is wrong here? How can I slice my dataframe based on time with millisecond information?

I am not sure why the direct string does not work, but it appears that the string-to-time conversion inside between_time does not handle fractional seconds. You can work around it by converting explicitly to time objects:

Code:

test.between_time(*pd.to_datetime(['22:08:12.200', '22:08:12.400']).time)

Test Code:

import pandas as pd
import numpy as np

idx2 = [
    pd.to_datetime('2016-08-31 22:08:12.000'),
    pd.to_datetime('2016-08-31 22:08:12.200'),
    pd.to_datetime('2016-08-31 22:08:12.400')]

test = pd.DataFrame(
    {'value': [1, 1, 3], 'groups': ['A', np.nan, 'A']}, index=idx2)

print(test.between_time(
    *pd.to_datetime(['22:08:12.200', '22:08:12.400']).time))

Results:

                        groups  value
2016-08-31 22:08:12.200    NaN      1
2016-08-31 22:08:12.400      A      3
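
As a variation on the same workaround, the strings can be parsed with the standard library before they reach between_time. The following is a minimal sketch, assuming Python 3.7+ (for datetime.time.fromisoformat); the helper name between_time_ms is hypothetical and not part of pandas.

```python
import datetime

import numpy as np
import pandas as pd

idx = pd.to_datetime([
    '2016-08-31 22:08:12.000',
    '2016-08-31 22:08:12.200',
    '2016-08-31 22:08:12.400',
])
test = pd.DataFrame(
    {'value': [1, 1, 3], 'groups': ['A', np.nan, 'A']}, index=idx)


def between_time_ms(df, start, end):
    """Hypothetical helper: parse 'HH:MM:SS.fff' strings into
    datetime.time objects, then delegate to DataFrame.between_time."""
    start_t = datetime.time.fromisoformat(start)
    end_t = datetime.time.fromisoformat(end)
    return df.between_time(start_t, end_t)


result = between_time_ms(test, '22:08:12.200', '22:08:12.400')
print(result)
```

Because between_time receives datetime.time objects rather than strings, its own string parsing is never involved.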

You can use standard datetime.time objects (both endpoints are included by default):

import datetime
test.between_time(datetime.time(22, 8, 12, 200000), datetime.time(22, 8, 12, 400000))

You don't need to use between_time; you can filter directly on the index.

test[(test.index >= '2016-08-31 22:08:12.200') & (test.index <= '2016-08-31 22:08:12.400')]
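
The comparison above is tied to one specific date. If the goal is to filter on time of day regardless of date, a boolean mask can be built from the index's time component instead. This is only a sketch under assumed data: the second date in the index below is invented for illustration.

```python
import datetime

import numpy as np
import pandas as pd

idx = pd.to_datetime([
    '2016-08-31 22:08:12.000',
    '2016-08-31 22:08:12.200',
    '2016-09-01 22:08:12.400',  # different day, assumed for illustration
])
df = pd.DataFrame({'value': [1, 1, 3]}, index=idx)

start = datetime.time(22, 8, 12, 200000)
end = datetime.time(22, 8, 12, 400000)

# DatetimeIndex.time yields an array of datetime.time objects;
# compare each one against the bounds to build a boolean mask.
mask = np.array([start <= t <= end for t in df.index.time])
selected = df[mask]
print(selected)
```

The list comprehension runs in pure Python, so for large frames between_time with datetime.time arguments is likely the faster choice.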

For whatever reason, the following will NOT work when milliseconds are specified.

# doesn't work with milliseconds:
test['2016-08-31 22:08:12':'2016-08-31 22:08:12']
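
Whether string slices with milliseconds behave as expected may depend on the pandas version. One way to sidestep string parsing entirely is to slice with explicit pd.Timestamp bounds through .loc. This is a sketch, assuming a sorted DatetimeIndex like the one in the question:

```python
import pandas as pd

idx = pd.to_datetime([
    '2016-08-31 22:08:12.000',
    '2016-08-31 22:08:12.200',
    '2016-08-31 22:08:12.400',
])
test = pd.DataFrame({'value': [1, 1, 3]}, index=idx)

# Explicit Timestamp bounds; label-based slicing with .loc
# includes both endpoints.
start = pd.Timestamp('2016-08-31 22:08:12.200')
end = pd.Timestamp('2016-08-31 22:08:12.400')
subset = test.loc[start:end]
print(subset)
```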
