Delete rows after a certain value at its max index in a pandas dataframe

Question

I have a pandas dataframe with a rate column that looks like this:

import numpy as np
import pandas as pd

num = np.repeat(12, 3)
num1 = np.repeat(11, 3)
num2 = np.repeat(7, 2)
num3 = np.repeat(10, 2)
num4 = np.repeat(7, 3)
num5 = np.repeat(9, 5)
num6 = np.repeat(3, 4)
num7 = np.repeat(7, 4)

df = pd.DataFrame(columns= ['rate'])
df['rate'] = num
df = pd.concat([df, pd.DataFrame(num1, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num2, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num3, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num4, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num5, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num6, columns=['rate'])])
df = pd.concat([df, pd.DataFrame(num7, columns=['rate'])])
df = df.reset_index(drop = True)
values = (7,9)

There can be more 7s or 9s. I would like to delete the 2 rows that follow the end point (max index) of each run of 7 or 9. The expected result looks like this:

num = np.repeat(12, 3)
num1 = np.repeat(11, 3)
num2 = np.repeat(7, 2)
num3 = np.repeat(7, 3)
num4 = np.repeat(9, 3)
num5 = np.repeat(3, 2)
num6 = np.repeat(7, 4)

dd = pd.DataFrame(columns= ['rate'])
dd['rate'] = num
dd = pd.concat([dd, pd.DataFrame(num1, columns=['rate'])])
dd = pd.concat([dd, pd.DataFrame(num2, columns=['rate'])])
dd = pd.concat([dd, pd.DataFrame(num3, columns=['rate'])])
dd = pd.concat([dd, pd.DataFrame(num4, columns=['rate'])])
dd = pd.concat([dd, pd.DataFrame(num5, columns=['rate'])])
dd = pd.concat([dd, pd.DataFrame(num6, columns=['rate'])])
dd = dd.reset_index(drop = True)

Any suggestions on how I can do that? Thank you for your time and effort!

Answer 1

Here is one way to do it using the Pandas shift method:

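# Setup: index label of the last row of each run of 7s or 9s
# (relies on the default 0..n-1 index produced by reset_index in the question)
max_indices = df[(df["rate"] != df["rate"].shift(-1)) & (df["rate"].isin([7, 9]))].index
index = df.index.to_list()
new_index = []
start = 0

# Build new index: keep rows up to each run end, then skip the 2 rows after it
for idx in max_indices:
    new_index = new_index + index[start: idx + 1]
    start = idx + 3

# Keep any rows left after the last run end (none remain in this example)
new_index = new_index + index[start:]

dd = df.loc[new_index, :].reset_index(drop=True)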
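
For comparison, the same deletion can probably be expressed without an explicit index loop by shifting a boolean "run end" mask forward. This is only a sketch under assumptions, not part of the original answer: the names run_end, to_drop, n_drop and result are made up for illustration, and the dataframe is rebuilt here in a single np.repeat call so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Same data as in the question, built in one call
df = pd.DataFrame({"rate": np.repeat([12, 11, 7, 10, 7, 9, 3, 7],
                                     [3, 3, 2, 2, 3, 5, 4, 4])})
values = (7, 9)
n_drop = 2  # hypothetical parameter: rows to delete after each run end

# True on the last row of every run of a value in `values`
run_end = df["rate"].isin(values) & (df["rate"] != df["rate"].shift(-1))

# True on each of the n_drop rows that directly follow a run end
to_drop = pd.Series(False, index=df.index)
for k in range(1, n_drop + 1):
    to_drop |= run_end.shift(k, fill_value=False)

result = df[~to_drop].reset_index(drop=True)
```

Because the mask is positional, this variant does not depend on the index labels being 0..n-1, and rows after the last run end are kept automatically.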

With the question's data, the dd built by the loop matches the expected dataframe from the question.

Answer 1
0 · Accepted · 2022-11-27 09:13:35