Replace row values by condition if they are in a certain time range
I'm trying to replace certain values in a DataFrame row based on two conditions. First, they must be in a certain time range. Additionally, the value in this time range has to be in a list of values to be replaced.
My best attempt:
df = df[df.between_time('06:00', '20:00')].replace([0, 1, 2, 3], np.nan, inplace=True)
This is the error I get:
ValueError: Boolean array expected for the condition, not object
The DataFrame looks like this:
datetime | vehicles
---|---
2021-01-01 00:00:00 | 13.0
2021-01-01 00:15:00 | 9.0
And so on...
The main goal is to replace all values between 06:00 and 20:00 (8 pm) with NaN if they are <= 3.
import pandas as pd
import numpy as np
First, convert your 'datetime' column to a datetime dtype (skip this step if it is already datetime64[ns]):
df['datetime']=pd.to_datetime(df['datetime'])
Then set your 'datetime' column as the index (skip this step if it is already the index):
df=df.set_index('datetime')
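As a quick check of the two preparation steps, here is a minimal sketch that rebuilds the two rows shown in the question and confirms that the index ends up as a DatetimeIndex, which between_time() relies on. The variable name is_dt_index is only an illustrative helper, not part of the original answer.

```python
import pandas as pd

# Two rows taken from the question's sample table.
df = pd.DataFrame({
    "datetime": ["2021-01-01 00:00:00", "2021-01-01 00:15:00"],
    "vehicles": [13.0, 9.0],
})

df["datetime"] = pd.to_datetime(df["datetime"])  # strings -> datetime64[ns]
df = df.set_index("datetime")                    # between_time() needs a DatetimeIndex

# Illustrative helper: True when the index is usable with between_time().
is_dt_index = isinstance(df.index, pd.DatetimeIndex)
```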
Now make use of the between_time() method and the apply() method:
resultdf=df.between_time('06:00', '20:00')['vehicles'].apply(lambda x:np.nan if x<=3 else x)
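If you would rather avoid a Python-level lambda, the same selection can probably be expressed with Series.mask(), which sets entries to NaN wherever the condition holds. The sketch below uses made-up 15-minute readings around 06:00 (the values and the helper name daytime are assumptions for illustration only).

```python
import numpy as np
import pandas as pd

# Illustrative data: 05:30, 05:45, 06:00, 06:15 (values are made up).
idx = pd.date_range("2021-01-01 05:30", periods=4, freq="15min")
df = pd.DataFrame({"vehicles": [2.0, 1.0, 3.0, 7.0]}, index=idx)
df.index.name = "datetime"

# Keep only rows whose time of day lies in 06:00-20:00 (both ends inclusive).
daytime = df.between_time("06:00", "20:00")["vehicles"]

# Values <= 3 become NaN; everything else is kept unchanged.
resultdf = daytime.mask(daytime <= 3)
```

Here the 05:30 and 05:45 rows are not selected at all, so their small values are left alone when the result is written back.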
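Finally, assign the result back to the 'vehicles' column (targeting the column directly means the values do not have to be reshaped by hand):
df.loc[resultdf.index, 'vehicles']=resultdf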
Now if you print df, you will get your desired output.
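For reference, the attempt in the question fails because between_time() returns a filtered DataFrame rather than a boolean mask, and replace(..., inplace=True) returns None. A possible one-step alternative, sketched here on made-up data, builds a boolean mask from DatetimeIndex.indexer_between_time() and assigns in place. The names pos and in_window are illustrative, and this is one way to do it rather than the answer's own method.

```python
import numpy as np
import pandas as pd

# Illustrative data: 05:30, 05:45, 06:00, 06:15 (values are made up).
idx = pd.date_range("2021-01-01 05:30", periods=4, freq="15min")
df = pd.DataFrame({"vehicles": [2.0, 1.0, 3.0, 7.0]}, index=idx)

# Integer positions of rows whose time of day is within 06:00-20:00 (inclusive).
pos = df.index.indexer_between_time("06:00", "20:00")

# Turn the positions into a boolean mask aligned with df's rows.
in_window = np.zeros(len(df), dtype=bool)
in_window[pos] = True

# Replace only values that are both inside the window and <= 3.
df.loc[in_window & (df["vehicles"] <= 3).to_numpy(), "vehicles"] = np.nan
```

Only the 06:00 reading is replaced in this sample: the earlier small values fall outside the time window, and the 06:15 value is above the threshold.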