Select rows of a pandas DataFrame based on column thresholds

Question

I have a pandas dataframe with a column "value" and a column "timestamp". Now I would like to filter the rows according to thresholds of the timestamp. I have done the following:

idx = df.index[df['timestamp'] >= start and df['timestamp'] <= end]
df = df.loc[idx]

df is the dataframe, and start and end are two integers.

Somehow this does not work. I'm getting an error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

EDIT: There is a further problem. start is a dataframe with only one value (one row, one column). end is a dataframe with several rows and only one column (but I'm only interested in the last row). When I do the following:

    print(end.iloc[-1])
    print(start.iloc[0])

I'm getting the following output:

1508504026077
start_timestamp_milli    1508502348946
Name: 0, dtype: int64

When I then try to do print(df[column] >= start.iloc[0]) I'm getting an error:

ValueError: Can only compare identically-labeled Series objects

Consequently, mask=(df['timestamp'] >= start & df['timestamp'] <= end) also fails.

Answer 1

IIUC:

# Each comparison needs its own parentheses: & binds more tightly than >= and <=
mask = (df['timestamp'] >= start) & (df['timestamp'] <= end)

df = df[mask]
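The mask needs each comparison in its own parentheses, and given the EDIT in the question, start and end would also have to be reduced to plain scalars before comparing. A minimal sketch of how that might look, assuming start and end really are one-column DataFrames as described. The sample data, the end column name end_timestamp_milli, and the variables start_ts, end_ts, filtered and filtered_between are made up for illustration:

```python
import pandas as pd

# Hypothetical data standing in for the asker's frames.
df = pd.DataFrame({
    "timestamp": [1508502348000, 1508502348946, 1508503000000,
                  1508504026077, 1508505000000],
    "value": [1, 2, 3, 4, 5],
})
start = pd.DataFrame({"start_timestamp_milli": [1508502348946]})
# Column name below is an assumption; the question does not show it.
end = pd.DataFrame({"end_timestamp_milli": [1508503000000, 1508504026077]})

# Reduce the one-column frames to plain integers so the comparison
# is Series-vs-scalar instead of Series-vs-Series/DataFrame.
start_ts = int(start.iloc[0, 0])   # first row, only column
end_ts = int(end.iloc[-1, 0])      # last row, only column

# Parenthesize each comparison before combining with &.
mask = (df["timestamp"] >= start_ts) & (df["timestamp"] <= end_ts)
filtered = df[mask]

# Alternative: Series.between is inclusive on both ends by default.
filtered_between = df[df["timestamp"].between(start_ts, end_ts)]

print(filtered)
```

With the sample data above, both filtered and filtered_between keep the rows whose value is 2, 3 and 4. If start or end turn out to be Series rather than DataFrames, a single positional index (start.iloc[0], end.iloc[-1]) would be the analogous step.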

Answer 1: accepted, 2017-11-03 17:24:46