Keep rows of a pandas dataframe based on row and column conditions

Question

Hello, I have a pandas dataframe that I want to clean. Here is an example:

IDBILL	IDBUYER	BILL	DATE
001	768787	45	1897-07-24
001	768787	67	1897-07-24
001	768787	98	1897-07-24
002	768787	30	1897-07-24
002	768787	15	1897-07-24
002	768787	12	1897-07-24
005	786545	45	1897-08-19
008	657676	89	1989-09-23
009	657676	42	1989-09-23
010	657676	18	1989-09-23
012	657676	51	1990-03-10
016	892354	73	1990-03-10
018	892354	48	1765-02-14
020	892354	62	1765-02-14

I want to delete the highest bills (and keep the lowest) when the bills are made on the same day, by the same IDBUYER, and their bill IDs follow each other. I want to get this:

IDBILL	IDBUYER	BILL	DATE
002	768787	30	1897-07-24
002	768787	15	1897-07-24
002	768787	12	1897-07-24
005	786545	45	1897-08-19
010	657676	18	1989-09-23
012	657676	51	1990-03-10
016	892354	73	1990-03-10
018	892354	48	1765-02-14
020	892354	62	1765-02-14

Thank you in advance.

Answer 1

One solution:

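# Sort by amount so the lowest bills come first (IDBILL must be numeric for diff())
df = df.sort_values('BILL')
# Label runs of bill IDs per day and buyer; a jump greater than 1 starts a new run
run = df.groupby(['DATE', 'IDBUYER'])['IDBILL'].transform(lambda x: x.diff().gt(1).cumsum())
# cc: position of each row within its run; cc2: number of rows sharing its IDBILL
cc = df.groupby(['DATE', 'IDBUYER', run]).cumcount()
cc2 = df.groupby(['DATE', 'IDBUYER', 'IDBILL'])['BILL'].transform('count')
# Keep the first cc2 rows of each run, then restore the original row order
df.loc[~cc.floordiv(cc2).astype(bool)].sort_index()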
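
If the solution above is hard to follow, here is a more explicit sketch of one way to read the requirement. It is not the original answer's method: it assumes that whole bill IDs are kept or dropped together (as the expected output suggests), that `IDBILL` is stored as an integer, and that "follow each other" means consecutive IDs with a gap of at most 1. The helper name `keep_lowest_in_runs` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question. IDBILL is stored as an integer
# (an assumption) so that consecutive IDs can be detected with diff().
df = pd.DataFrame({
    "IDBILL": [1, 1, 1, 2, 2, 2, 5, 8, 9, 10, 12, 16, 18, 20],
    "IDBUYER": [768787] * 6 + [786545] + [657676] * 4 + [892354] * 3,
    "BILL": [45, 67, 98, 30, 15, 12, 45, 89, 42, 18, 51, 73, 48, 62],
    "DATE": (["1897-07-24"] * 6 + ["1897-08-19"] + ["1989-09-23"] * 3
             + ["1990-03-10"] * 2 + ["1765-02-14"] * 2),
})


def keep_lowest_in_runs(frame):
    """Hypothetical helper: within each run of consecutive bill IDs
    (same DATE and IDBUYER), keep only the bill ID that holds the
    lowest BILL amount."""
    keys = ["DATE", "IDBUYER"]
    ordered = frame.sort_values(keys + ["IDBILL"])

    # Number the runs: a jump greater than 1 in IDBILL starts a new run.
    run = (
        ordered.groupby(keys)["IDBILL"]
        .transform(lambda s: s.diff().gt(1).cumsum())
        .rename("run")
    )

    # Lowest amount in each run, and lowest amount of each bill ID.
    run_min = ordered.groupby(keys + [run])["BILL"].transform("min")
    bill_min = ordered.groupby(keys + ["IDBILL"])["BILL"].transform("min")

    # A bill ID survives if it contains its run's lowest amount.
    # Ties between two bill IDs keep both (a choice made for this sketch).
    return ordered[bill_min.eq(run_min)].sort_index()


result = keep_lowest_in_runs(df)
print(result)
```

On the sample data this keeps bill IDs 2, 5, 10, 12, 16, 18 and 20, matching the expected table in the question. Sorting by `IDBILL` before calling `diff()` keeps the gaps non-negative, so the run detection does not depend on the order of the bill amounts.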

Solution 1
1 2021-05-19 16:55:22
