
PANDAS - Drop column based on value of cell in that column

I have a CSV and I'm interested in dropping every column where the value of a cell in one particular row equals 0. Specifically, I want to delete any column that has a 0 in the "Total Conservation (Gapped)" row. Any advice on how to achieve this?

Out[7]: 
                            Unnamed: 0    0        1  ... 1585 1586 1587
0                      HCoV_HKU1_spike    -        -  ...    x    x    x
1                 Rat_CoV_Parker_spike    -        -  ...    x    x    x
2                  Mouse_CoV_MHV_spike    -        -  ...    x    x    x
3                  Rat_CoV_HKU24_spike    x        -  ...    x    x    x
4                            EquineCoV    -        -  ...    x    x    x
5                     Rabbit_CoV_spike    -        -  ...    x    x    x
6                      HCoV_OC43_spike    -        -  ...    x    x    x
7                            CanineCoV    -        -  ...    x    x    x
8                     Bovine_CoV_spike    -        -  ...    x    x    x
9                  Hedgehog_CoV3_spike    x        x  ...    x    x    x
10                 Hedgehog_CoV2_spike    x        x  ...    x    x    x
11                 Hedgehog_CoV1_spike    x        x  ...    x    x    x
12                     HCoV_MERS_spike    x        x  ...    x    x    x
13        Tylonycteris_pachypus_BatCoV    x        x  ...    x    x    x
14             BCoV_Tylonycteris_spike    x        x  ...    x    x    x
15                   BCoV_SC2013_spike    -        -  ...    x    x    x
16                    BCoV_HKU25_spike    x        x  ...    x    x    x
17             BCoV_Pipistrellus_spike    x        x  ...    x    x    x
18                   BCoV_GD2013_spike    x        x  ...    x    x    x
19                BCoV_SARS_HKU3_spike    x        -  ...    x    x    x
20                  BCoV_HeB2013_spike    x        -  ...    x    x    x
21                   BCoV_YN2013_spike    -        -  ...    x    x    x
22                  BCoV_HuB2013_spike    -        -  ...    x    x    x
23          BCoV_SARS_like_WIV16_spike    -        -  ...    x    x    x
24                     HCoV_SARS_spike    -        -  ...    x    x    x
25           Civet_CoV_SARS_2004_spike    -        -  ...    x    x    x
26                     BCoV_BM48_spike    x        -  ...    x    x    x
27                  Pangolin_CoV_spike    -        -  ...    x    x    x
28                    HCoV_SARS2_spike    -        -  ...    x    x    x
29                   BCoV_RatG13_spike    -        -  ...    x    x    x
30                                 NaN  NaN      NaN  ...  NaN  NaN  NaN
31   SARS-Clade Conservation (gap inc)    0        0  ...    0    0    0
32   MERS-Clade Conservation (gap inc)    0        0  ...    0    0    0
33   OC43-Clade Conservation (gap inc)    0        0  ...    0    0    0
34                                 NaN  NaN      NaN  ...  NaN  NaN  NaN
35  SARS-Clade Conservation (ungapped)    0  #DIV/0!  ...    0    0    0
36  SARS-Clade Conservation (ungapped)    0        0  ...    0    0    0
37  SARS-Clade Conservation (ungapped)    0  #DIV/0!  ...    0    0    0
38                                 NaN  NaN      NaN  ...  NaN  NaN  NaN
39         Total Conservation (Gapped)    0        0  ...    0    0    0
40       Total Conservation (Ungapped)    0        0  ...    0    0    0

[41 rows x 1589 columns]

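First and foremost, your DataFrame is messy. Look into how to tidy it up: for example, transpose it so that the labels in the first column become the column names, and then you could just filter on the "Total Conservation (Gapped)" column. For the DataFrame as it stands, try using .loc with a boolean mask over the columns:

# Assumes the zeros were read as the integer 0 or the string '0'
row = df.loc[df['Unnamed: 0'] == 'Total Conservation (Gapped)'].iloc[0]
df2 = df.loc[:, row.astype(str) != '0']  # keep only columns whose cell in that row is not 0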
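As a rough, self-contained illustration of selecting columns by the value in one row, here is a minimal sketch. The miniature CSV and the helper name `drop_zero_columns` are made up for this example, and it assumes the label column is named `Unnamed: 0` as in the output shown in the question. It uses `pd.to_numeric(..., errors="coerce")` so that zeros stored as `'0'`, `0`, or `0.0` are all treated alike, while non-numeric cells (the row label, `#DIV/0!`) become NaN and are kept.

```python
import io

import pandas as pd

# Hypothetical miniature of the CSV in the question (not the real data).
csv_text = """,0,1,2,3
HCoV_SARS_spike,-,x,x,-
HCoV_MERS_spike,x,x,-,x
Total Conservation (Gapped),0,0.5,0,1
Total Conservation (Ungapped),0,#DIV/0!,0.25,1
"""


def drop_zero_columns(df, label, label_col="Unnamed: 0"):
    """Drop every column whose cell in the row named `label` equals 0."""
    # Pick the single row whose label matches; the result is a Series
    # indexed by the DataFrame's column names.
    row = df.loc[df[label_col] == label].iloc[0]
    # Non-numeric cells (such as the label itself) are coerced to NaN.
    values = pd.to_numeric(row, errors="coerce")
    # NaN != 0 evaluates to True, so the label column is kept.
    return df.loc[:, values.ne(0)]


df = pd.read_csv(io.StringIO(csv_text))
result = drop_zero_columns(df, "Total Conservation (Gapped)")
print(result.columns.tolist())  # ['Unnamed: 0', '1', '3']
```

If several rows share the same label, this sketch only looks at the first match; whether that is acceptable depends on the data.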
