pandas remove rows from dataframe based on multiple conditions without for loops
Pandas - remove rows based on two conditions
I have a pandas DataFrame like this:
ColA ColB ColC
Apple 2019-03-02 18:00:00 Saturday
Orange 2019-03-03 10:00:00 Sunday
Mango 2019-03-04 09:00:00 Monday
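I am trying to remove rows from the DataFrame based on a couple of conditions.
If the datetime is at or after 9 AM and at or before 5 PM, remove the row.
If it falls on a weekend (Saturday or Sunday), do not remove it.
The expected output will not contain Mango.
This turned out to be trickier than I expected: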
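# True when the hour is 9 through 17 (hour-level check, so 17:30 also counts)
s1 = df.ColB.dt.hour.between(9, 17)
# Keep rows outside that window, plus every weekend row
df.loc[~s1 | df.ColC.isin(['Saturday', 'Sunday'])]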
ColA ColB ColC
0 Apple 2019-03-02 18:00:00 Saturday
1 Orange 2019-03-03 10:00:00 Sunday
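To check the mask on more than the three rows from the question, here is a minimal sketch. It assumes the weekend can be derived from ColB itself via `dt.dayofweek` rather than read from ColC. The frame name `sample` and the extra Kiwi and Grape rows are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical sample: the three rows from the question plus two extra
# weekday rows (Kiwi, Grape) added only to exercise both branches.
sample = pd.DataFrame({
    'ColA': ['Apple', 'Orange', 'Mango', 'Kiwi', 'Grape'],
    'ColB': pd.to_datetime([
        '2019-03-02 18:00:00',  # Saturday evening
        '2019-03-03 10:00:00',  # Sunday morning
        '2019-03-04 09:00:00',  # Monday, inside the window
        '2019-03-05 12:30:00',  # Tuesday, inside the window
        '2019-03-06 20:15:00',  # Wednesday, outside the window
    ]),
})

in_window = sample.ColB.dt.hour.between(9, 17)  # hours 9..17, both ends included
is_weekend = sample.ColB.dt.dayofweek >= 5      # Monday=0 ... Sunday=6

# Keep a row if it is outside the window or falls on a weekend
kept = sample.loc[~in_window | is_weekend]
print(kept.ColA.tolist())  # ['Apple', 'Orange', 'Grape']
```

Mango and Kiwi are the only weekday rows inside the window, so they are the only ones dropped.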
Or use:
s1 = pd.Index(df.ColB).indexer_between_time('09:00:00', '17:00:00')  # positions inside 09:00-17:00, both ends included
s1 = df.index.isin(s1)  # positions match labels only with the default RangeIndex
df.loc[~s1 | df.ColC.isin(['Saturday', 'Sunday'])]
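The `df.index.isin(...)` step relies on the frame having the default 0..n-1 index, because `indexer_between_time` returns integer positions, not labels. If the index holds other labels, one possible workaround is to turn the positions into a boolean mask directly. This is only a sketch under that assumption; `frame`, its string labels, and the Pear and Fig rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels, so positions and labels differ.
frame = pd.DataFrame({
    'ColA': ['Apple', 'Orange', 'Mango', 'Pear', 'Fig'],
    'ColB': pd.to_datetime([
        '2019-03-02 18:00:00',  # Saturday
        '2019-03-03 10:00:00',  # Sunday
        '2019-03-04 09:00:00',  # Monday, exactly at the start of the window
        '2019-03-05 17:00:00',  # Tuesday, exactly at the end of the window
        '2019-03-05 17:30:00',  # Tuesday, just past the window
    ]),
}, index=['a', 'b', 'c', 'd', 'e'])

# Integer positions of rows whose time of day is within 09:00-17:00 (inclusive)
pos = pd.DatetimeIndex(frame['ColB']).indexer_between_time('09:00', '17:00')

# Build a positional boolean mask instead of comparing against index labels
in_window = np.zeros(len(frame), dtype=bool)
in_window[pos] = True

is_weekend = (frame['ColB'].dt.dayofweek >= 5).to_numpy()
result = frame[~in_window | is_weekend]
print(result['ColA'].tolist())  # ['Apple', 'Orange', 'Fig']
```

Because the comparison is on the time of day rather than the hour, 17:00:00 is treated as inside the window while 17:30:00 is not.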
To give another option, you could write it like this:
cond1 = df.ColB.dt.hour >= 9    # 09:00 or later
cond2 = df.ColB.dt.hour <= 16   # Before 17:00 (exactly 17:00:00 is not matched)
cond3 = df.ColB.dt.weekday < 5  # Mon-Fri
df = df[~(cond1 & cond2 & cond3)]
Full example:
import pandas as pd
df = pd.DataFrame({
    'ColA': ['Apple', 'Orange', 'Mango'],
    'ColB': pd.to_datetime([
        '2019-03-02 18:00:00',
        '2019-03-03 10:00:00',
        '2019-03-04 09:00:00'
    ]),
    'ColC': ['Saturday', 'Sunday', 'Monday']
})
cond1 = df.ColB.dt.hour >= 9    # 09:00 or later
cond2 = df.ColB.dt.hour <= 16   # Before 17:00 (exactly 17:00:00 is not matched)
cond3 = df.ColB.dt.weekday < 5  # Mon-Fri
df = df[~(cond1 & cond2 & cond3)]  # conditions mark the rows to drop, hence ~
print(df)
Output:
ColA ColB ColC
0 Apple 2019-03-02 18:00:00 Saturday
1 Orange 2019-03-03 10:00:00 Sunday
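If the same filter is needed in several places, it could be wrapped in a small helper. The sketch below is one way to do that, not part of the answers above: the function name `drop_weekday_window` and its parameters are hypothetical, and it assumes the window should include both 09:00:00 and 17:00:00 exactly, which it checks by comparing the time elapsed since midnight.

```python
import pandas as pd

def drop_weekday_window(df, col='ColB', start='09:00:00', end='17:00:00'):
    """Return df without Mon-Fri rows whose time of day lies in [start, end]."""
    stamps = df[col]
    # Time elapsed since midnight, as a Timedelta per row
    time_of_day = stamps - stamps.dt.normalize()
    in_window = time_of_day.between(pd.Timedelta(start), pd.Timedelta(end))
    is_weekday = stamps.dt.weekday < 5  # Monday=0 ... Friday=4
    return df[~(in_window & is_weekday)]

# The data from the question
df = pd.DataFrame({
    'ColA': ['Apple', 'Orange', 'Mango'],
    'ColB': pd.to_datetime([
        '2019-03-02 18:00:00',
        '2019-03-03 10:00:00',
        '2019-03-04 09:00:00',
    ]),
    'ColC': ['Saturday', 'Sunday', 'Monday'],
})

cleaned = drop_weekday_window(df)
print(cleaned['ColA'].tolist())  # ['Apple', 'Orange']
```

The helper returns a new frame and leaves the input untouched, so it can be chained with other filters.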