
How to search entire Pandas dataframe for a string and get the name of the column that contains it?

I want to find the name of the column in a dataframe ("categories") that contains a given string.

categories

    Groceries   Electricity Fastfood    Parking 
0   SHOP        ELCOMPANY   MCDONALDS   park
1   MARKET      ELECT       Subway      car
2   market      electr      Restauran   247 

Say I want to search this entire dataframe for the string "MCDO". The answer should be "Fastfood". I tried using str.contains, but it doesn't seem to work on dataframes.

How can I achieve this? Thank you.

You can apply str.contains to each column and then reduce the result with any:

df.apply(lambda x : x.str.contains('MCDO')).any().loc[lambda x : x].index
Index(['Fastfood'], dtype='object')
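
For a self-contained check, the sketch below rebuilds the example frame from the question (assuming every cell is a string, including '247') and runs the same idea step by step. The regex=False and na=False arguments are additions not present in the answer above: the first treats the search term as a literal substring, the second counts missing values as non-matches.

```python
import pandas as pd

# Rebuild the example frame from the question; all values are assumed to be strings.
df = pd.DataFrame({
    "Groceries": ["SHOP", "MARKET", "market"],
    "Electricity": ["ELCOMPANY", "ELECT", "electr"],
    "Fastfood": ["MCDONALDS", "Subway", "Restauran"],
    "Parking": ["park", "car", "247"],
})

# Boolean frame: True where a cell contains the literal substring.
mask = df.apply(lambda col: col.str.contains("MCDO", regex=False, na=False))

# Reduce to one boolean per column, then keep the labels that are True.
hits = mask.any()
matching_columns = hits[hits].index.tolist()
print(matching_columns)  # ['Fastfood']
```

Converting the index to a list makes the result easier to reuse than the Index object shown above.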

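Or replace the False values with NaN, drop the columns that are entirely NaN, and read off the remaining column name. Note that .item() only works when exactly one column matches:

import numpy as np
print(df.apply(lambda x: x.str.contains('MCDO')).replace(False, np.nan).dropna(axis=1, how='all').columns.item())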

Output:

Fastfood

If you can match on the entire string, it is simpler:

(df == 'MCDONALDS').any().idxmax()

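Otherwise, use apply. Note that startswith only matches cells that begin with the string; use str.contains to match anywhere in the cell: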

df.apply(lambda x: x.str.startswith('MCDO').any()).idxmax()
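
One caveat with the idxmax variants: on an all-False result, idxmax returns the first label, so a search with no match would still report a column. A minimal sketch of a guarded version follows. The helper name first_matching_column is hypothetical, the frame is rebuilt from the question with all cells assumed to be strings, and str.contains is swapped in for startswith so the match can occur anywhere in the cell.

```python
import pandas as pd

# Example frame from the question; all values are assumed to be strings.
df = pd.DataFrame({
    "Groceries": ["SHOP", "MARKET", "market"],
    "Electricity": ["ELCOMPANY", "ELECT", "electr"],
    "Fastfood": ["MCDONALDS", "Subway", "Restauran"],
    "Parking": ["park", "car", "247"],
})


def first_matching_column(frame, text):
    """Return the first column with a cell containing `text`, or None."""
    # One boolean per column: does any cell contain the literal substring?
    hits = frame.apply(
        lambda col: col.str.contains(text, regex=False, na=False).any()
    )
    # idxmax alone would return the first column even when nothing matched,
    # so confirm that at least one column matched before using it.
    return hits.idxmax() if hits.any() else None


print(first_matching_column(df, "MCDO"))   # Fastfood
print(first_matching_column(df, "PIZZA"))  # None
```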

One can also use a for loop for this:

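def strfinder(df, mystr):
    # Scan the columns left to right and return the first one with a matching cell.
    for col in df:
        for item in df[col]:
            # str() guards against non-string cells such as the number 247.
            if mystr in str(item):
                return col
    return None  # no column contains mystr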

print(strfinder(df, 'MCDO'))

To get all columns that contain the string, e.g. in the modified dataframe below:

    Groceries   Electricity  Fastfood    Parking 
0   SHOP        ELCOMPANY   MCDONALDS   park
1   MARKET      MCDON       Subway      car
2   market      electr      Restauran   247 

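one can use a list comprehension. Wrapping the inner test in any() lists each column only once, even if several of its cells match:

mystr = 'MCDO'
outlist = [col
           for col in df
           if any(mystr in str(item) for item in df[col])]
print(outlist)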

Output:

['Electricity', 'Fastfood']
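
The same result can be had without a Python-level loop over the individual cells. The sketch below rebuilds the modified frame and, as an assumption for illustration, stores 247 as an integer to show the mixed-type case. The astype(str) cast and regex=False are additions here, not part of the answer above.

```python
import pandas as pd

# Modified frame from above; 247 is deliberately an int to show mixed types.
df = pd.DataFrame({
    "Groceries": ["SHOP", "MARKET", "market"],
    "Electricity": ["ELCOMPANY", "MCDON", "electr"],
    "Fastfood": ["MCDONALDS", "Subway", "Restauran"],
    "Parking": ["park", "car", 247],
})

mystr = "MCDO"

# Cast each column to str so non-string cells do not break the .str accessor,
# then keep every column where at least one cell contains the substring.
outlist = [
    col for col in df.columns
    if df[col].astype(str).str.contains(mystr, regex=False).any()
]
print(outlist)  # ['Electricity', 'Fastfood']
```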

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.

 