Drop rows within a specific time interval

Question

We have a Pandas DataFrame with certain values at certain times.

For example:

         @ts                  @value  Glucose  Diff   smooth_diff  new    P      N      C1   C2
    135  2021-10-29 11:16:00  167     167.0    -3.0   15.45        15.45  17.95  17.45  NaN  0.0
    155  2021-10-29 12:56:00  162     162.0    -15.0  15.35        15.35  17.95  16.00  NaN  0.0
    243  2021-10-29 20:16:00  133     133.0    0.0    15.25        15.25  19.85  15.75  NaN  0.0
    245  2021-10-29 20:26:00  134     134.0    0.0    15.50        15.50  15.75  15.60  NaN  0.0
    113  2021-10-29 09:26:00  130     130.0    1.0    16.75        16.75  0.00   21.70  NaN  NaN

Now we want to drop rows that fall within a 1-hour interval of each other, based on the @ts column. In this example we want to drop the row at 2021-10-29 20:26:00, since it is within 1 hour of the previous one. We can't seem to figure out a way to do this.

Any help?

Answer 1

Something like this might work:

import pandas as pd

# create dataframe (only creating 2 cols for ease)
df = pd.DataFrame({
    '@ts': ['2021-10-29 11:16:00', '2021-10-29 12:56:00', '2021-10-29 20:16:00', 
            '2021-10-29 20:26:00'],
    '@value': [167, 162, 133, 134]
})

# split @ts column into separate columns - date(d) and time(t)
df[["d", "t"]] = df["@ts"].str.split(" ", expand=True)

# split time column into separate parts, hours, mins and secs
df[["h", "m", "s"]] = df["t"].str.split(":", expand=True)
# drop duplicates based on date and hour, keep the first row
df = df.drop_duplicates(subset=["d", "h"], keep="first")
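
Note that the string splitting above groups rows by calendar date and hour rather than measuring elapsed time. Below is a minimal sketch of the same idea on parsed timestamps, assuming the column parses with `pd.to_datetime`. The helper name `drop_same_hour` and the sample timestamps are hypothetical and only for illustration. It also shows the limitation: two rows less than an hour apart both survive if they straddle an hour boundary.

```python
import pandas as pd

def drop_same_hour(df, ts_col="@ts"):
    """Keep the first row seen in each calendar hour (hypothetical helper)."""
    ts = pd.to_datetime(df[ts_col])
    # Floor each timestamp to the start of its hour, then mark every
    # repeat of an hour bucket after the first occurrence.
    return df[~ts.dt.floor("h").duplicated(keep="first")]

# Made-up sample: three rows in the 20:00 hour, one just after 21:00.
sample = pd.DataFrame({
    "@ts": ["2021-10-29 20:16:00", "2021-10-29 20:26:00",
            "2021-10-29 20:59:00", "2021-10-29 21:01:00"],
    "@value": [133, 134, 135, 136],
})
result = drop_same_hour(sample)
print(result)
```

Here the 20:16 and 21:01 rows are both kept although they are only 45 minutes apart, so this approach fits best when "one row per clock hour" is what is actually wanted.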

Answer 2

Convert the column to datetime. Subtract the previous row's time from each row's time and take the total seconds. Take the absolute value and check whether it is less than 3600 to create a boolean mask. Then use the inverted mask to keep only the required rows.

import numpy as np
import pandas as pd

df['@ts'] = pd.to_datetime(df['@ts'])
df = df[~(df['@ts'] - df['@ts'].shift()
          ).dt.total_seconds().fillna(np.inf).abs().lt(3600)]
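
As a self-contained check, the mask described above can be wrapped in a function and run on the timestamps from the question. This is a sketch rather than the answerer's exact code: the function name `drop_close_to_previous` and the `min_gap` parameter are hypothetical, and sorting by time first is an added assumption, since the sample in the question is not in chronological order.

```python
import pandas as pd

def drop_close_to_previous(df, ts_col="@ts", min_gap=pd.Timedelta(hours=1)):
    """Drop rows less than min_gap after the previous row (hypothetical helper)."""
    out = df.copy()
    out[ts_col] = pd.to_datetime(out[ts_col])
    # Assumption: compare rows in chronological order.
    out = out.sort_values(ts_col)
    # Time elapsed since the previous row; NaT for the first row.
    gap = out[ts_col].diff()
    # NaT < min_gap evaluates to False, so the first row is always kept.
    return out[~(gap < min_gap)]

# Timestamps, values and index labels taken from the question's sample.
df = pd.DataFrame(
    {
        "@ts": ["2021-10-29 11:16:00", "2021-10-29 12:56:00",
                "2021-10-29 20:16:00", "2021-10-29 20:26:00",
                "2021-10-29 09:26:00"],
        "@value": [167, 162, 133, 134, 130],
    },
    index=[135, 155, 243, 245, 113],
)
result = drop_close_to_previous(df)
print(result)
```

Each row is compared with its immediate predecessor, not with the last row that was kept. In a run of closely spaced rows only the first survives, even if a later row in the run is more than an hour past it.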

