
Remove N random rows based on conditions on multiple columns in pandas

df:

    Text column  Title     Numbers column
0          abc   rom-com               1
1          xyz    comedy               2
2           hi   rom-com               4
3          jkl    murder               5
4          abc  thriller               2
and so on...

What I want:

I want to remove 5 random rows where the Title column has the value rom-com, and 6 random rows where the Title column has the value 'murder'.

Code:

df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index & \
[df['Title'].str.contains('murder')].sample(6).index)

Error:

AttributeError: 'list' object has no attribute 'sample'

The code above works for one title at a time, but not for both together.

# This alone works, for either 'murder' or 'rom-com' separately.
df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index)

But I am not able to remove the rows for both values together in one step.

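The second expression is missing the leading df, so [...] is a plain Python list, which is what raises the AttributeError. With that fixed, the two sampled indexes can be combined with Index.union: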

df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index.union(df[df['Title'].str.contains('murder')].sample(6).index))
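
If there are more than two titles to thin out, the same idea can be wrapped in a small helper. This is only a sketch: drop_random_rows, its counts dict and its seed argument are names made up for illustration, not part of pandas. It also assumes an exact match on Title (==) instead of str.contains, and it caps the sample size so that sample() does not raise when fewer rows match than requested.

```python
import pandas as pd


def drop_random_rows(df, column, counts, seed=None):
    """Hypothetical helper: drop counts[value] random rows per value of `column`."""
    to_drop = df.index[:0]  # empty index with the same type as df.index
    for value, n in counts.items():
        matches = df[df[column] == value]  # assumption: exact match, not substring
        # Cap n so sample() does not raise when too few rows match.
        picked = matches.sample(n=min(n, len(matches)), random_state=seed)
        to_drop = to_drop.union(picked.index)
    return df.drop(to_drop)


# Made-up data: 8 rom-com rows, 7 murder rows, 3 comedy rows.
df = pd.DataFrame({
    "Text column": [f"text{i}" for i in range(18)],
    "Title": ["rom-com"] * 8 + ["murder"] * 7 + ["comedy"] * 3,
    "Numbers column": list(range(18)),
})

df1 = drop_random_rows(df, "Title", {"rom-com": 5, "murder": 6}, seed=0)
print(df1["Title"].value_counts())
```

Because each group is sampled separately and the indexes are unioned, exactly 5 rom-com rows and 6 murder rows are removed here, and rows with other titles are left untouched.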
