Pandas isin on many many columns

I want to select rows from my dataframe df where any of the many columns contains a value that's in a list my_list . There are dozens of columns, and there could be more in the future, so I don't want to iterate over each column in a list.

I don't want this:

# for loop / iteration
for col in df.columns:
    df.loc[df[col].isin(my_list), "indicator"] = 1

Nor this:

# really long indexing
df = df[df.col1.isin(my_list) | df.col2.isin(my_list) | df.col3.isin(my_list) | ... | df.col_N.isin(my_list)]  # ad nauseam

Nor do I want to reshape the dataframe from a wide to a long format.

I'm thinking (hoping) there's a way to do this in one line, applying the isin() to many columns all at once.

Thanks!

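You can use DataFrame.isin(), which is a DataFrame method rather than a string method. It returns a boolean frame, so reduce it across columns with any(axis=1) to select rows: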

new_df = df[df.isin(my_list).any(axis=1)]
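
To see why the row-wise reduction matters, here is a minimal sketch. It assumes the small example frame from the second answer in this thread; the names mask, masked, row_mask and selected are only illustrative. Indexing with the raw 2-D result of isin() keeps every row and turns non-matching cells into NaN, while any(axis=1) collapses it to one boolean per row.

```python
import pandas as pd

# Illustrative data, copied from the example in the second answer.
df = pd.DataFrame({"col_1": [1, 2, 3], "col_2": [1, 4, 7], "col_3": [10, 12, 18]})
my_list = [3, 12]

# Element-wise membership test: same shape as df, True where a cell is in my_list.
mask = df.isin(my_list)

# Indexing with the 2-D mask keeps all rows and replaces non-matching cells with NaN.
masked = df[mask]

# Reducing across columns gives one boolean per row, which selects whole rows.
row_mask = mask.any(axis=1)
selected = df[row_mask]

print(masked)
print(selected)
```

So if the goal is to keep whole rows, the reduced mask is probably what you want; the unreduced form is closer to DataFrame.where().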

Alternatively, you may try:

df[df.apply(lambda x: x.isin(mylist)).any(axis=1)]

Or:

df[df[df.columns].isin(mylist).any(axis=1)]

You don't even need to create a list first unless it is really necessary; you can pass the values directly, as follows.

df[df[df.columns].isin([3, 12]).any(axis=1)]

After checking your efforts:

Example DataFrame:

>>> df
   col_1  col_2  col_3
0      1      1     10
1      2      4     12
2      3      7     18

List construct:

>>> mylist
[3, 12]

Solutions:

>>> df[df.col_1.isin(mylist) | df.col_2.isin(mylist) | df.col_3.isin(mylist)]
   col_1  col_2  col_3
1      2      4     12
2      3      7     18

>>> df[df.isin(mylist).any(axis=1)]
   col_1  col_2  col_3
1      2      4     12
2      3      7     18

Or:

>>> df[df[df.columns].isin(mylist).any(axis=1)]
   col_1  col_2  col_3
1      2      4     12
2      3      7     18

Or:

>>> df[df.apply(lambda x: x.isin(mylist)).any(axis=1)]
   col_1  col_2  col_3
1      2      4     12
2      3      7     18
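
If you want the "indicator" column from the question rather than a filtered frame, or only want to search some of the columns, the same mask can be reused. This is just a sketch under assumptions: cols_to_check is a hypothetical subset chosen for illustration, and the flag is written as 0/1 (the loop in the question would leave non-matching rows as NaN instead).

```python
import pandas as pd

# Same illustrative data as in the example above.
df = pd.DataFrame({"col_1": [1, 2, 3], "col_2": [1, 4, 7], "col_3": [10, 12, 18]})
mylist = [3, 12]

# Hypothetical subset of columns to search; use df.columns to search all of them.
cols_to_check = ["col_1", "col_2"]

# One boolean per row: True if any of the chosen columns holds a value from mylist.
hit = df[cols_to_check].isin(mylist).any(axis=1)

# Add a 0/1 flag column instead of filtering; assign() returns a new frame.
flagged = df.assign(indicator=hit.astype(int))

print(flagged)
```

Because col_3 is excluded here, row 1 (which matches only on 12 in col_3) is not flagged; only row 2 is.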
