
Remove N random rows based on conditions on multiple columns in pandas

df

    Text column  Title     Numbers column
0          abc   rom-com               1
1          xyz    comedy               2
2           hi   rom-com               4
3          jkl    murder               5
4          abc  thriller               2
and so on................

What I want:

I want to remove 5 random rows where the Title column has the value 'rom-com', and 6 random rows where the Title column has the value 'murder'.

Code:

df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index &
[df['Title'].str.contains('murder')].sample(6).index)

Error:

AttributeError: 'list' object has no attribute 'sample'

The code above works well for a single title, but not for both together.

df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index)
# This alone works for 'rom-com' and for 'murder' separately.

But I am not able to remove the rows for both Title values together in one step.

Index.union is one option:

df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index.union(df[df['Title'].str.contains('murder')].sample(6).index))
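
For context, the AttributeError in the question comes from the second operand: [df['Title'].str.contains('murder')] is a plain Python list wrapping a boolean Series, because the leading df is missing, so it has no .sample method. Separately, & between two Index objects does not produce their union (older pandas versions treated it as an intersection), so it would not select both sets of rows even with the typo fixed. The following is a minimal sketch of the Index.union approach on made-up data; the 30-row frame and the random_state value are assumptions added only to make the example reproducible.

```python
import pandas as pd

# Hypothetical sample data: 10 rows each of 'rom-com', 'comedy' and 'murder',
# so there are enough rows to sample 5 and 6 of them.
df = pd.DataFrame({
    "Text column": [f"text{i}" for i in range(30)],
    "Title": ["rom-com", "comedy", "murder"] * 10,
    "Numbers column": range(30),
})

# Index labels of 5 random 'rom-com' rows and 6 random 'murder' rows.
# random_state is fixed here only so that repeated runs pick the same rows.
rom_com_idx = df[df["Title"].str.contains("rom-com")].sample(5, random_state=0).index
murder_idx = df[df["Title"].str.contains("murder")].sample(6, random_state=0).index

# Combine both sets of labels and drop them in a single call.
df1 = df.drop(rom_com_idx.union(murder_idx))

print(df1["Title"].value_counts())
```

Note that sample(n) raises a ValueError if fewer than n rows match the condition, so the real data needs at least 5 'rom-com' rows and 6 'murder' rows for this to work.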

