I have a df with many columns and I want to select all rows that have any 2 column-values nonzero. filtering on rows where specific columns are nonzero is trivial, I mean this is probably trivial as well but it's eluding me. have
/ want
dataframes for reproduceable example below:
import pandas as pd
have = pd.DataFrame({'ID': [1,2,3,4,5],
'grp1': [1,0,0,0,0],
'grp2': [0,2,0,0,0],
'grp3': [1,1,0,0,5]})
want = have.iloc[0:2]
>>> want
ID grp1 grp2 grp3
0 1 1 0 1
1 2 0 2 1
last note - I pass back and forth between pandas and sql(ite) a lot, so if there's a more elegant way to do this in sql than pandas that I'm missing, I'm open to that too.
In pandas:
have[have.iloc[:,1:].gt(0).sum(1).gt(1)]
ID grp1 grp2 grp3
0 1 1 0 1
1 2 0 2 1
IIUC, you need:
have.loc[(have.iloc[:,1:].=0).sum(1)>1]
as you said non-zero values.
You can convert the column type to bool then zero will be Fasle
and non zero will be True
and sum the True
to get non zero count.
out = have[have.filter(like='grp').astype(bool).sum(axis=1).ge(2)]
print(out)
ID grp1 grp2 grp3
0 1 1 0 1
1 2 0 2 1
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.