
How to delete a specific number of random rows in a Pandas dataframe based on a condition?

I want to delete a specific number 'n' of rows from a dataframe, where the rows to be deleted are chosen randomly. The rows must also be selected based on a condition on the values of a particular column.

For example, my dataframe is as below:

C1    C2    C3
1     0     a
2     1     b
3     0     c
4     0     d
5     0     e
6     1     f
7     1     g
8     1     h
9     0     i

Now, I want to remove n=2 rows at random from among the rows where C2==1.

The resulting frame can be as below:

C1    C2    C3
1     0     a
3     0     c
4     0     d
5     0     e
6     1     f
8     1     h
9     0     i

or

C1    C2    C3
1     0     a
2     1     b
3     0     c
4     0     d
5     0     e
7     1     g
9     0     i

or any other valid combination. The question here shows how to remove 'n' sentences randomly, but it doesn't cover applying a condition.

Filter the rows with boolean indexing, pick random rows from the result with DataFrame.sample, and finally pass their index to drop. Because the selection is random, the output shown is one possible result:

N = 2
df1 = df.drop(df[df['C2'].eq(1)].sample(N).index)
print(df1)
   C1  C2 C3
0   1   0  a
1   2   1  b
2   3   0  c
3   4   0  d
4   5   0  e
6   7   1  g
8   9   0  i
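
For a self-contained version of the approach above, the following sketch rebuilds the example frame and wraps the steps in a helper. The name drop_random_rows, the seed parameter, and the guard for too few matching rows are assumptions added for illustration and are not part of the original answer. It also assumes the index labels are unique, since drop removes every row that carries a given label.

```python
import pandas as pd


def drop_random_rows(df, mask, n, seed=None):
    """Return a copy of df without n randomly chosen rows where mask is True.

    Hypothetical helper for illustration; assumes df has a unique index.
    """
    candidates = df[mask]
    if n > len(candidates):
        raise ValueError(f"cannot drop {n} rows; only {len(candidates)} match")
    # sample() draws without replacement by default, so the n labels are distinct
    to_drop = candidates.sample(n=n, random_state=seed).index
    return df.drop(to_drop)


# The example frame from the question
df = pd.DataFrame({
    "C1": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

# seed=0 is an arbitrary choice that makes the run repeatable
df1 = drop_random_rows(df, df["C2"].eq(1), n=2, seed=0)
print(df1)
```

Passing a seed is optional; leaving it as None gives a different selection on each run, which matches the behaviour of the one-liner above.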

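Or use np.random.choice (with numpy imported as np) to pick random index values. Pass replace=False, because the default samples with replacement and could pick the same label twice, which would drop fewer than N rows:

# replace=False keeps the N chosen index labels distinct
df = df.drop(np.random.choice(df.index[df['C2'].eq(1)], N, replace=False))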
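
As a runnable sketch of the NumPy variant, the example below uses numpy.random.default_rng rather than the legacy np.random.choice function; that swap, the seed value, and the variable names candidates, to_drop and df2 are assumptions made for illustration. It samples with replace=False so the chosen labels cannot repeat, since repeated labels would remove fewer than N rows.

```python
import numpy as np
import pandas as pd

# The example frame from the question
df = pd.DataFrame({
    "C1": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

N = 2
rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

# Index labels of the rows that satisfy the condition
candidates = df.index[df["C2"].eq(1)].to_numpy()

# replace=False guarantees N distinct labels
to_drop = rng.choice(candidates, size=N, replace=False)

df2 = df.drop(to_drop)
print(df2)
```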


 