Remove random N number of rows based on conditions on multiple columns in pandas
df:
   Text column  Title     Numbers column
0  abc          rom-com   1
1  xyz          comedy    2
2  hi           rom-com   4
3  jkl          murder    5
4  abc          thriller  2
and so on...
What I want:
I want to remove 5 random rows where the Title column has the value 'rom-com', and 6 random rows where the Title column has the value 'murder'.
Code:
df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index & \
[df['Title'].str.contains('murder')].sample(6).index)
Error:
AttributeError: 'list' object has no attribute 'sample'
The code above works for one title at a time, but not for both together.
df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index)
# this alone works, for either 'murder' or 'rom-com' separately.
But with both together I am not able to remove the rows matching multiple values.
One possibility is to combine the two sampled indexes with Index.union:
df1 = df.drop(df[df['Title'].str.contains('rom-com')].sample(5).index.union(df[df['Title'].str.contains('murder')].sample(6).index))
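For reference, here is a minimal, self-contained sketch of the same idea. The sample data (40 rows, 10 per title) and the names rom_idx and murder_idx are made up for illustration, and random_state is passed only so the run is reproducible. Note that the original attempt failed because the second term lacked the leading df, so [df['Title'].str.contains('murder')] was a plain Python list rather than a filtered DataFrame.

```python
import pandas as pd

# Hypothetical sample data: 10 rows per title, so both samples below always fit.
titles = ["rom-com", "murder", "comedy", "thriller"]
df = pd.DataFrame({
    "Text column": [f"text{i}" for i in range(40)],
    "Title": [titles[i % 4] for i in range(40)],
    "Numbers column": range(40),
})

# Index labels of 5 random 'rom-com' rows and 6 random 'murder' rows.
# random_state is an assumption added here to make the sketch reproducible.
rom_idx = df[df["Title"].str.contains("rom-com")].sample(5, random_state=0).index
murder_idx = df[df["Title"].str.contains("murder")].sample(6, random_state=0).index

# union() merges both sets of labels; drop() then removes them in one call.
df1 = df.drop(rom_idx.union(murder_idx))

print(df1["Title"].value_counts())
```

With the default replace=False, sample(n) raises a ValueError if a title has fewer than n matching rows, so it may be worth checking the group sizes first.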