I have a CSV and I'm interested in dropping every column whose value in one particular row equals 0. Specifically, I want to delete any column that has a 0 in the "Total Conservation (Gapped)" row. Any advice on how to achieve this?
Out[7]:
Unnamed: 0 0 1 ... 1585 1586 1587
0 HCoV_HKU1_spike - - ... x x x
1 Rat_CoV_Parker_spike - - ... x x x
2 Mouse_CoV_MHV_spike - - ... x x x
3 Rat_CoV_HKU24_spike x - ... x x x
4 EquineCoV - - ... x x x
5 Rabbit_CoV_spike - - ... x x x
6 HCoV_OC43_spike - - ... x x x
7 CanineCoV - - ... x x x
8 Bovine_CoV_spike - - ... x x x
9 Hedgehog_CoV3_spike x x ... x x x
10 Hedgehog_CoV2_spike x x ... x x x
11 Hedgehog_CoV1_spike x x ... x x x
12 HCoV_MERS_spike x x ... x x x
13 Tylonycteris_pachypus_BatCoV x x ... x x x
14 BCoV_Tylonycteris_spike x x ... x x x
15 BCoV_SC2013_spike - - ... x x x
16 BCoV_HKU25_spike x x ... x x x
17 BCoV_Pipistrellus_spike x x ... x x x
18 BCoV_GD2013_spike x x ... x x x
19 BCoV_SARS_HKU3_spike x - ... x x x
20 BCoV_HeB2013_spike x - ... x x x
21 BCoV_YN2013_spike - - ... x x x
22 BCoV_HuB2013_spike - - ... x x x
23 BCoV_SARS_like_WIV16_spike - - ... x x x
24 HCoV_SARS_spike - - ... x x x
25 Civet_CoV_SARS_2004_spike - - ... x x x
26 BCoV_BM48_spike x - ... x x x
27 Pangolin_CoV_spike - - ... x x x
28 HCoV_SARS2_spike - - ... x x x
29 BCoV_RatG13_spike - - ... x x x
30 NaN NaN NaN ... NaN NaN NaN
31 SARS-Clade Conservation (gap inc) 0 0 ... 0 0 0
32 MERS-Clade Conservation (gap inc) 0 0 ... 0 0 0
33 OC43-Clade Conservation (gap inc) 0 0 ... 0 0 0
34 NaN NaN NaN ... NaN NaN NaN
35 SARS-Clade Conservation (ungapped) 0 #DIV/0! ... 0 0 0
36 SARS-Clade Conservation (ungapped) 0 0 ... 0 0 0
37 SARS-Clade Conservation (ungapped) 0 #DIV/0! ... 0 0 0
38 NaN NaN NaN ... NaN NaN NaN
39 Total Conservation (Gapped) 0 0 ... 0 0 0
40 Total Conservation (Ungapped) 0 0 ... 0 0 0
[41 rows x 1589 columns]
First and foremost, your dataframe is messy. Look into how to "tidy up" your dataframe (transpose it so that the labels in the first column become the column names, and then you could just select on the total conservation column). For your messy dataframe as it stands, try using .loc:
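row = df.loc[df['Unnamed: 0'] == 'Total Conservation (Gapped)'].iloc[0]  # the row to test, as a Series indexed by column
df2 = df.loc[:, row.astype(str) != '0']  # keep only the columns whose cell in that row is not '0'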
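For what it's worth, here is a minimal, self-contained sketch of the column-dropping idea. The toy CSV below is made up for illustration (it is not your real file), and the sketch assumes the row labels sit in the first column, as in your printout. It reads that column as the index, converts the "Total Conservation (Gapped)" row to numbers, and uses the result as a boolean mask over the columns:

```python
import io
import pandas as pd

# Toy stand-in for the alignment CSV (hypothetical data, not the real file).
csv_text = """,0,1,2,3
HCoV_SARS_spike,-,x,x,x
HCoV_MERS_spike,x,x,-,x
Total Conservation (Gapped),0,0.5,0,1
Total Conservation (Ungapped),0,#DIV/0!,0.2,1
"""

# index_col=0 turns the label column into the index, so rows can be picked by name.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Every column mixes symbols and numbers, so the cells are read as strings.
# Convert the row of interest to numbers before comparing with 0;
# anything non-numeric becomes NaN, and NaN != 0 is True, so such columns are kept.
totals = pd.to_numeric(df.loc["Total Conservation (Gapped)"], errors="coerce")

# Keep only the columns whose value in that row is not 0.
df2 = df.loc[:, totals != 0]
print(df2)
```

With your real file you would swap the `io.StringIO(...)` argument for the path to your CSV. If you would rather drop columns where the value is non-numeric (for example `#DIV/0!`) as well, you could combine the mask with `totals.notna()`.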