
Removing columns from a pandas DataFrame using a condition

I have a pandas dataframe df:

Fruit    Apple   Orange   Banana   Pear
basket1   0        1       10      15
basket2   1        5        7      10
basket3   10       15       0       0

I want to remove columns (fruit types) based on the following two conditions.

If the sum of a fruit across basket1, basket2 and basket3 is less than 20, remove that column. The result in this case is

Fruit    Orange   Pear
basket1   1       15
basket2   5       10
basket3   15      0

From the above result, I want to further remove any column where fewer than 3 baskets contain more than 0 of that fruit. The expected result is

Fruit    Orange   
basket1   1       
basket2   5       
basket3   15    

Can you help me write code for this? I know how to get the total of each fruit across all baskets with df.sum(axis=0), but I am unable to proceed from this point.

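You can combine two boolean masks, one per condition:

df.sum().ge(20) keeps the columns whose total is at least 20; df.gt(0).sum().ge(3) keeps the columns where at least 3 baskets have a positive count. A total of exactly 20 is kept, because only sums below 20 are to be removed.

# Make the basket labels the index so that only the numeric fruit columns remain
df = df.set_index('Fruit')

df
#        Apple  Orange  Banana  Pear
#Fruit              
#basket1     0       1      10    15
#basket2     1       5       7    10
#basket3    10      15       0     0

# Keep only the columns that satisfy both conditions
df.loc[:, df.sum().ge(20) & df.gt(0).sum().ge(3)]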

#       Orange
#Fruit  
#basket1     1
#basket2     5
#basket3    15
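
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and wraps the two masks in a helper. The helper name filter_columns and its parameters min_total and min_positive are made up for illustration and are not part of pandas. It uses ge(20) for the total, on the assumption that a sum of exactly 20 should be kept since the question only removes sums "less than 20".

```python
import pandas as pd

# Sample data copied from the question; 'Fruit' holds the basket labels.
df = pd.DataFrame(
    {
        "Fruit": ["basket1", "basket2", "basket3"],
        "Apple": [0, 1, 10],
        "Orange": [1, 5, 15],
        "Banana": [10, 7, 0],
        "Pear": [15, 10, 0],
    }
).set_index("Fruit")


def filter_columns(frame, min_total=20, min_positive=3):
    """Hypothetical helper: keep columns that meet both thresholds."""
    # Condition 1: the column total is at least min_total.
    total_ok = frame.sum().ge(min_total)
    # Condition 2: at least min_positive rows have a value greater than 0.
    positive_ok = frame.gt(0).sum().ge(min_positive)
    # Both masks are indexed by column name, so they combine element-wise.
    return frame.loc[:, total_ok & positive_ok]


result = filter_columns(df)
print(result)
```

With the sample data only the Orange column passes both checks: Apple and Banana total less than 20, and Pear is positive in only two baskets.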
