
How to get values in a pandas dataframe column between 2 times?

I have a dataset where the date_time column was split into separate date and time columns, so that the date can be used independently of the time in different scenarios. But now I need to get the time values between 5:00 and 8:00. I can only find pandas functions that work on datetimes. Is there any way to select values from a time-only column?

I think part of the issue is the data type of the time column. I tried removing the colon from the time values, so that 5:00 becomes 500, but I am still unable to select the values I need. I keep getting a KeyError on 'time'.

Here is what I have tried so far:

# Get bird sightings between 5-8am. Remove the colon in time first.
early_birds_df = france_df['time'].str.replace(':','')

# Convert time to a numeric data type, so we can treat it like a number
early_birds_df['time'] = pd.to_numeric(early_birds_df['time'], errors='coerce')
early_birds_df.head()

But this returns an error:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897             try:
-> 2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'time'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:
-> 2900                 raise KeyError(key) from err
   2901 
   2902         if tolerance is not None:

KeyError: 'time'

Here is a data snippet to use as an example. I want to use the 'time' column, and the frame has an index of sorts. Let's say I want to retrieve all rows with times between 1:00 and 3:10. What code can I use to do that?

        date        time
1       8/15/2013   0:18
2       8/15/2013   0:48
3       8/15/2013   1:17
4       8/15/2013   1:47
5       8/15/2013   2:17
6       8/15/2013   2:47
7       8/15/2013   3:02
8       8/15/2013   3:17
9       8/15/2013   3:32
10      8/15/2013   3:47

If the bounds fall on whole hours, you can filter on the hour component (for your example of 5:00 and 8:00):

df[df["date_time"].dt.hour.between(5,8)]
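One thing worth checking with the hour-based filter: Series.between is inclusive on both ends by default, so hour 8 matches everything from 08:00 through 08:59. A minimal sketch, assuming a hypothetical frame with a datetime column named date_time (the sample values are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name "date_time" comes from the answer.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2013-08-15 04:59",
        "2013-08-15 05:00",
        "2013-08-15 07:30",
        "2013-08-15 08:00",
        "2013-08-15 08:45",
        "2013-08-15 09:00",
    ])
})

# Inclusive on both ends: keeps hours 5, 6, 7 and 8, so 08:45 is included.
by_hour = df[df["date_time"].dt.hour.between(5, 8)]

# If the window should stop before 08:00, compare against a half-open range.
hours = df["date_time"].dt.hour
before_eight = df[(hours >= 5) & (hours < 8)]

print(by_hour)
print(before_eight)
```

If the upper bound has to be exactly 08:00 and no later, the hour component alone cannot express that, which is where the more general approaches come in.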

To be more general, you can use pandas.DatetimeIndex.indexer_between_time, but this requires converting your timestamp series to a DatetimeIndex first, i.e.

df["date_time"].iloc[pd.DatetimeIndex(df["date_time"]).indexer_between_time("05:00", "08:00")]
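As a rough, self-contained illustration of the indexer approach: indexer_between_time returns integer positions, which is why the result goes through iloc. The frame below is hypothetical (the count column and the sample values are assumptions for illustration), and DataFrame.between_time is shown only as an alternative that needs the datetimes in the index:

```python
import pandas as pd

# Hypothetical sample data; "count" is an illustrative column, not from the question.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2013-08-15 04:59",
        "2013-08-15 05:00",
        "2013-08-15 07:30",
        "2013-08-15 08:00",
        "2013-08-15 08:45",
    ]),
    "count": [1, 2, 3, 4, 5],
})

# Build a DatetimeIndex from the column and ask for matching positions.
idx = pd.DatetimeIndex(df["date_time"])
positions = idx.indexer_between_time("05:00", "08:00")

# Positions are integer offsets, so use iloc to keep whole rows.
selected = df.iloc[positions]

# Alternative: move the datetimes into the index and use between_time.
selected_alt = df.set_index("date_time").between_time("05:00", "08:00")

print(selected)
print(selected_alt)
```

Both calls include the start and end times by default, so the 08:00 row is kept and the 08:45 row is not.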

Or you can convert the times to their corresponding timedeltas since the start of the day, and then compare against timedelta values, e.g.

time = df["date_time"] - df["date_time"].dt.floor("D")
df[time.between(pd.Timedelta("05:00:00"), pd.Timedelta("08:00:00"))]
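The timedelta comparison also works when the bounds are not on whole hours. A small sketch under the same assumption of a datetime column named date_time, with made-up sample values and hypothetical bounds of 05:15 and 07:45:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2013-08-15 05:00",
        "2013-08-15 05:15",
        "2013-08-16 07:30",
        "2013-08-16 07:45",
        "2013-08-17 08:00",
    ])
})

# Subtracting midnight of the same day leaves the time of day as a timedelta,
# so rows from different dates can be compared on time alone.
time_of_day = df["date_time"] - df["date_time"].dt.floor("D")

# Both bounds are inclusive by default.
mask = time_of_day.between(pd.Timedelta("05:15:00"), pd.Timedelta("07:45:00"))
in_window = df[mask]

print(in_window)
```

The sketch names the intermediate series time_of_day rather than time, which avoids clashing with Python's standard time module if that happens to be imported.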

Edit

Just saw the new data format with the time column. In that case you can append seconds to the strings so that we can work with to_timedelta, e.g.

pd.to_timedelta(df["time"] + ":00").between(pd.to_timedelta("05:00:00"), pd.to_timedelta("08:00:00"))
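The expression above evaluates to a boolean mask, so it still has to be used to index the frame. Here is a sketch applying it to the sample rows from the question with the 1:00 to 3:10 window. It also shows what is probably behind the KeyError: selecting a single column and calling .str.replace gives back a Series, so a later ['time'] lookup searches the integer row labels instead of the column names. The variable names are illustrative.

```python
import pandas as pd

# Sample rows copied from the question; the index starts at 1 as shown there.
df = pd.DataFrame(
    {
        "date": ["8/15/2013"] * 10,
        "time": ["0:18", "0:48", "1:17", "1:47", "2:17",
                 "2:47", "3:02", "3:17", "3:32", "3:47"],
    },
    index=range(1, 11),
)

# This is a Series, not a DataFrame, so it has no "time" column to look up.
stripped = df["time"].str.replace(":", "", regex=False)
print(type(stripped).__name__)

# "H:MM" plus ":00" gives "H:MM:00", which pd.to_timedelta can parse.
elapsed = pd.to_timedelta(df["time"] + ":00")
mask = elapsed.between(pd.Timedelta("01:00:00"), pd.Timedelta("03:10:00"))

# Index the original frame with the mask to keep every column.
result = df[mask]
print(result)
```

Filtering the original frame with the mask, rather than reassigning a single column, keeps the date column alongside the matching times.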

