is there a way, to count all rows which contain at least one '1' in a dataframe checking multiple named columns?

Question

I have a dataset filled with Medicare beneficiaries. The question is: 'What proportion of patients have at least one of the chronic conditions described in the independent variables alzheimers, arthritis, cancer, copd, depression, diabetes, heart.failure, ihd, kidney, osteoporosis, and stroke?'

I tried creating a subset and using isnull() & any(), but i can´t get a proper solution.. also tried df.loc but it only lets me name one column..

i am attaching the dataset for better understanding!

https://drive.google.com/file/d/1R--YEsBCDHMXjqNzAumT2zzUAYvM1bWA/view?usp=sharing

Thanks!!

My Try´s:

claimss.loc[:, ["alzheimers","diabetes","arthritis"] == 1]

(Wanted to try it first with 3 columns, doesn´t work in the first place..)

Try with Subset:

filtered_df = df.loc[raw_df] == 1]

(Created a Subset where only index and independet variables(diseases) appear and tried to look for null-rows)

Answer 1

If need filter only some columns names use subset for filter columns names, compare by 1 by DataFrame.eq and last test at least one True by DataFrame.any :

claimss[claimss[["alzheimers","diabetes","arthritis"]].eq(1).any(axis=1)]

If need percentage use mean with boolean mask:

out = claimss[["alzheimers","diabetes","arthritis"]].eq(1).any(axis=1).mean()

is there a way, to count all rows which contain at least one '1' in a dataframe checking multiple named columns?

Question

1 answers

solution1
1 ACCPTED 2022-07-18 11:13:21

is there a way, to count all rows which contain at least one '1' in a dataframe checking multiple named columns?

Question

1 answers

solution1 1 ACCPTED 2022-07-18 11:13:21

solution1
1 ACCPTED 2022-07-18 11:13:21