Dropping rows in a certain time interval
We have a Pandas DataFrame with certain values at certain times.
For example:
     @ts                  @value  Glucose  Diff   smooth_diff  new    P      N      C1   C2
135  2021-10-29 11:16:00  167     167.0    -3.0   15.45        15.45  17.95  17.45  NaN  0.0
155  2021-10-29 12:56:00  162     162.0    -15.0  15.35        15.35  17.95  16.00  NaN  0.0
243  2021-10-29 20:16:00  133     133.0    0.0    15.25        15.25  19.85  15.75  NaN  0.0
245  2021-10-29 20:26:00  134     134.0    0.0    15.50        15.50  15.75  15.60  NaN  0.0
113  2021-10-29 09:26:00  130     130.0    1.0    16.75        16.75  0.00   21.70  NaN  NaN
Now we want to drop the rows whose @ts value is within 1 hour of another row's (so in this example we want to drop the row at 2021-10-29 20:26:00, as it is within a 1-hour time span of the previous one), but we can't seem to figure out a way to do this.
Any help?
Something like this might work:
import pandas as pd
# create dataframe (only creating 2 cols for ease)
df = pd.DataFrame({
    '@ts': ['2021-10-29 11:16:00', '2021-10-29 12:56:00', '2021-10-29 20:16:00',
            '2021-10-29 20:26:00'],
    '@value': [167, 162, 133, 134]
})
# split @ts column into separate columns - date (d) and time (t)
df[["d", "t"]] = df["@ts"].str.split(" ", expand=True)
# split time column into separate parts: hours, mins and secs
df[["h", "m", "s"]] = df["t"].str.split(":", expand=True)
# drop duplicates based on date and hour, keep the first row
# (note: this keeps one row per calendar hour, not one row per rolling 1-hour window)
df = df.drop_duplicates(subset=["d", "h"], keep="first")
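One caveat with grouping on the date and hour strings: two rows can be only minutes apart and still fall in different calendar hours. The short check below is only an illustration, using two made-up timestamps (20:56 and 21:05, which are not in the question's data), of how the hour-bucket approach keeps both rows even though they are 9 minutes apart.

```python
import pandas as pd

# Hypothetical timestamps chosen to straddle an hour boundary
df = pd.DataFrame({
    "@ts": ["2021-10-29 20:56:00", "2021-10-29 21:05:00"],
    "@value": [133, 134],
})

# Same bucketing idea as above: date string plus hour string
df[["d", "t"]] = df["@ts"].str.split(" ", expand=True)
df["h"] = df["t"].str.split(":").str[0]
by_hour = df.drop_duplicates(subset=["d", "h"], keep="first")

# Actual gap between the two rows
gap = pd.to_datetime(df["@ts"]).diff()

print(len(by_hour))   # both rows survive: different hour buckets
print(gap.iloc[1])    # real distance between the rows
```

So this works for the sample in the question, but if rows near an hour boundary matter, comparing actual time differences is the safer route.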
Convert the column to datetime. Subtract the previous row's time from each row's time and then evaluate the total seconds. Take the abs value and check whether it is less than 3600 to create a boolean mask. Then use the negated boolean mask to keep only the required rows.
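import numpy as np
df['@ts'] = pd.to_datetime(df['@ts'])
# first row has no predecessor (NaN difference), so treat its gap as infinite and keep it
df = df[~(df['@ts'] - df['@ts'].shift()
          ).dt.total_seconds().fillna(np.inf).apply(abs).lt(3600)]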
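Note that the mask above compares each row with the row directly before it, whether or not that row was itself dropped, and it depends on the row order. If the intent is instead "keep a row only when it is at least 1 hour after the last row that was kept", a small loop can do that. This is a sketch under assumptions, not the answer's method: drop_within is a hypothetical helper name, the frame is sorted by @ts first, and the index is assumed to be unique.

```python
import pandas as pd

def drop_within(df, col="@ts", min_gap=pd.Timedelta(hours=1)):
    """Keep a row only if it is at least `min_gap` after the last kept row.

    Hypothetical helper; assumes `col` is datetime and the index is unique.
    """
    ordered = df.sort_values(col)
    keep = []
    last_kept = None
    for idx, ts in ordered[col].items():
        if last_kept is None or ts - last_kept >= min_gap:
            keep.append(idx)
            last_kept = ts
    return ordered.loc[keep]

# The five timestamps from the question (only two columns for brevity)
df = pd.DataFrame({
    "@ts": pd.to_datetime([
        "2021-10-29 11:16:00", "2021-10-29 12:56:00",
        "2021-10-29 20:16:00", "2021-10-29 20:26:00",
        "2021-10-29 09:26:00",
    ]),
    "@value": [167, 162, 133, 134, 130],
})

result = drop_within(df)
print(result)  # the 20:26 row is dropped; rows come back sorted by @ts
```

On the question's data both approaches drop the same row once the frame is sorted; they differ only for runs of several close rows, where the loop measures from the last kept row rather than from the immediate predecessor.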