
How to delete a specific number of random rows in a Pandas DataFrame based on a condition?

I want to delete a specific number 'n' of rows from a dataframe, where the rows to be deleted are chosen randomly. The rows must also be selected based on a condition on a particular column's values.

For example, my dataframe is as below:

C1    C2    C3
1     0     a
2     1     b
3     0     c
4     0     d
5     0     e
6     1     f
7     1     g
8     1     h
9     0     i

Now I want to randomly remove n=2 rows that satisfy the condition C2==1.

The resulting frame could be:

C1    C2    C3
1     0     a
3     0     c
4     0     d
5     0     e
6     1     f
8     1     h
9     0     i

or

C1    C2    C3
1     0     a
2     1     b
3     0     c
4     0     d
5     0     e
7     1     g
9     0     i

or any other possible result. The question here shows how to remove 'n' rows randomly, but it doesn't cover applying a condition.

Filter the rows by boolean indexing, pick random rows from the result with DataFrame.sample, and finally pass their index to drop:

N = 2
df1 = df.drop(df[df['C2'].eq(1)].sample(N).index)
print(df1)
   C1  C2 C3
0   1   0  a
1   2   1  b
2   3   0  c
3   4   0  d
4   5   0  e
6   7   1  g
8   9   0  i
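For a reproducible run, the same filter, sample, drop approach can be sketched as a small self-contained script. The helper name `drop_random_rows` and its `seed` argument are hypothetical names introduced here for illustration; the cap on `n` is an added assumption so that the call does not fail when fewer than `n` rows match the condition.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "C1": range(1, 10),
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

def drop_random_rows(frame, mask, n, seed=None):
    """Drop up to n randomly chosen rows of `frame` where `mask` is True.

    Hypothetical helper, not part of pandas.
    """
    matching = frame[mask]
    # Assumption: never ask for more rows than actually match.
    n = min(n, len(matching))
    # sample() draws without replacement by default.
    chosen = matching.sample(n=n, random_state=seed).index
    return frame.drop(chosen)

df1 = drop_random_rows(df, df["C2"].eq(1), n=2, seed=0)
print(df1)
```

Passing a fixed `seed` makes the selection repeatable; leaving it as `None` gives a different pair of rows on each run.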

Or use np.random.choice for random index values, passing replace=False so that the same index cannot be picked twice:

df = df.drop(np.random.choice(df.index[df['C2'].eq(1)], N, replace=False))
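The NumPy variant can also be written with a seeded `Generator`, which is one way to keep the result repeatable. This is a minimal sketch rather than the answer's own code: the seed value 42 and the variable names `rng`, `candidates` and `to_drop` are assumptions chosen for illustration. It samples without replacement so that exactly N distinct rows are removed.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "C1": range(1, 10),
    "C2": [0, 1, 0, 0, 0, 1, 1, 1, 0],
    "C3": list("abcdefghi"),
})

N = 2
rng = np.random.default_rng(42)  # assumed seed, only for repeatability

# Index labels of the rows that satisfy the condition.
candidates = df.index[df["C2"].eq(1)].to_numpy()

# Draw distinct labels; cap the size at the number of candidates.
to_drop = rng.choice(candidates, size=min(N, len(candidates)), replace=False)

df1 = df.drop(to_drop)
print(df1)
```

With sampling with replacement, the same label could be drawn twice and fewer than N rows would be dropped, which is why `replace=False` matters here.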

