Pandas dataframe drop multiple rows based on datetime difference

I store datetimes in a pandas DataFrame in the format dd/mm/yyyy hh:mm:ss.

I want to drop every row in which the two datetime values (columns d1 and d2) are within 24 hours of one another.

Working on one pair of values at a time, I previously used the expression below, but it does not work inside the drop call:

df.drop(df[(df['d2'] - df['d1']).seconds / 3600 < 24].index)
>> AttributeError: 'Series' object has no attribute 'seconds'

This should work. It keeps only the rows whose two datetimes are at least 24 hours apart:

import datetime
df = df.loc[(df.d2 - df.d1) >= datetime.timedelta(days=1)]
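As for why the original attempt raised AttributeError: subtracting two datetime columns gives a Series of timedeltas, and a Series only exposes timedelta fields through the `.dt` accessor. Note also that `.dt.seconds` is just the seconds component (0 to 86399), so `.dt.total_seconds()` is the safer choice for an hours comparison. Here is a minimal sketch of the drop-based version, assuming d1 and d2 are already datetime columns; the sample values and the names `hours_apart` and `dropped` are made up for illustration:

```python
import pandas as pd

# Made-up sample data in day-first format (dd/mm/yyyy hh:mm:ss)
df = pd.DataFrame({
    "d1": pd.to_datetime(
        ["01/03/2021 08:00:00", "01/03/2021 08:00:00", "05/03/2021 12:00:00"],
        dayfirst=True,
    ),
    "d2": pd.to_datetime(
        ["01/03/2021 20:00:00", "03/03/2021 09:30:00", "06/03/2021 12:00:00"],
        dayfirst=True,
    ),
})

# The difference is a Series of timedeltas; use .dt to reach its fields
hours_apart = (df["d2"] - df["d1"]).dt.total_seconds() / 3600

# Drop the rows whose two datetimes are less than 24 hours apart
dropped = df.drop(df[hours_apart < 24].index)
print(dropped)
```

Only the first row (12 hours apart) is dropped here; a row that is exactly 24 hours apart is kept, which matches the `>=` comparison in the answer above.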

The fix is simple. First, parse both columns as datetimes:

import pandas as pd
df = pd.read_csv("test.csv")
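# The values are day-first (dd/mm/yyyy), so tell pandas not to assume month-first
df["d1"] = pd.to_datetime(df["d1"], dayfirst=True)
df["d2"] = pd.to_datetime(df["d2"], dayfirst=True)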

Now, if you subtract one column from the other:

df["d2"] - df["d1"]

the result is a Series of Timedelta values (displayed in days), so, as @kaan suggested, you can compare it directly against a Timedelta:

df.loc[(df["d2"] - df["d1"]) >= pd.Timedelta(days=1)] 
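Putting the steps together, here is a small end-to-end sketch. It assumes the file has columns named d1 and d2 in dd/mm/yyyy hh:mm:ss format; the inline CSV text stands in for test.csv, and the sample rows and the names `fmt`, `gap`, and `kept` are illustrative only. Taking the absolute difference is an extra assumption, added so that rows where d2 is earlier than d1 are still treated as "within 24 hours of one another":

```python
import io

import pandas as pd

# Stand-in for test.csv (made-up rows, day-first dates)
csv_text = """d1,d2
25/12/2020 23:00:00,26/12/2020 01:00:00
01/02/2021 00:00:00,03/02/2021 00:00:00
10/02/2021 12:00:00,09/02/2021 06:00:00
"""
df = pd.read_csv(io.StringIO(csv_text))

# An explicit format avoids any day/month ambiguity when parsing
fmt = "%d/%m/%Y %H:%M:%S"
df["d1"] = pd.to_datetime(df["d1"], format=fmt)
df["d2"] = pd.to_datetime(df["d2"], format=fmt)

# Absolute gap, so the order of d1 and d2 does not matter
gap = (df["d2"] - df["d1"]).abs()

# Keep only the rows that are at least one full day apart
kept = df.loc[gap >= pd.Timedelta(days=1)].reset_index(drop=True)
print(kept)
```

Note that `df.loc[...]` returns a filtered copy rather than modifying `df` in place, so the result has to be assigned to a variable.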

