How do I check if a DataFrame column value exists in any of multiple lists, and if not, fill another column?

I'm trying to fill df['Group'] with 'boys', 'girls', or 'both', depending on which list the row's df['Color'] value appears in, and leave df['Group'] as NaN if the value isn't in any of the lists.

I have this:

boys = ['Brown', 'Green']
girls = ['Violet', 'Verde']
both = ['Black', 'White']

           Color | Group
    ---------------------
    0  | 'Brown' |   NaN
    1  | 'Green' |   NaN
    2  | 'Black' |   NaN
    3  | 'White' |   NaN
    4  | 'Verde' |   NaN
    5  | 'Purple'|   NaN
    6  | 'Violet'|   NaN

I want this:

           Color | Group
    ---------------------
    0  | 'Brown' |   'boys'
    1  | 'Green' |   'boys'
    2  | 'Black' |   'both'
    3  | 'White' |   'both'
    4  | 'Verde' |   'girls'
    5  | 'Purple'|   NaN
    6  | 'Violet'|   'girls'

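You can create a dictionary that maps each group name to its list of colors, then invert it so that each color maps to its group: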

dct = dict(boys = ['Brown', 'Green'],
           girls = ['Violet', 'Verde'],
           both = ['Black', 'White'])

dct = {i: k for k, v in dct.items() for i in v}

Output:

{'Brown': 'boys',
 'Green': 'boys',
 'Violet': 'girls',
 'Verde': 'girls',
 'Black': 'both',
 'White': 'both'}
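
One caveat with the inversion above: if a color appears in more than one list, the later list silently overwrites the earlier one. Here is a minimal sketch of a check for that, using only the standard library. The names groups, duplicates, and color_to_group are illustrative and not from the original answer.

```python
from collections import Counter

# Same lists as in the question, keyed by group name.
groups = {
    "boys": ["Brown", "Green"],
    "girls": ["Violet", "Verde"],
    "both": ["Black", "White"],
}

# Count how many times each color occurs across all lists.
counts = Counter(color for colors in groups.values() for color in colors)

# Colors listed more than once would be overwritten during inversion.
duplicates = sorted(color for color, n in counts.items() if n > 1)

# Invert: each color becomes a key pointing at its group name.
color_to_group = {
    color: group for group, colors in groups.items() for color in colors
}

print(duplicates)
print(color_to_group)
```

With the lists from the question, duplicates is empty, so the inversion loses nothing.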

Then you can use the map method. Colors that are missing from the dictionary become NaN:

df['Group'] = df['Color'].map(dct)

Output:

    Color  Group
0   Brown   boys
1   Green   boys
2   Black   both
3   White   both
4   Verde  girls
5  Purple    NaN
6  Violet  girls
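
For reference, the steps above can be put together as one self-contained script. This is a sketch that assumes pandas is installed and rebuilds the DataFrame from the question. The names color_to_group and unmatched are illustrative additions.

```python
import pandas as pd

# Lists from the question.
boys = ["Brown", "Green"]
girls = ["Violet", "Verde"]
both = ["Black", "White"]

# Rebuild the example DataFrame from the question.
df = pd.DataFrame(
    {"Color": ["Brown", "Green", "Black", "White", "Verde", "Purple", "Violet"]}
)

# Invert {group: [colors]} into {color: group}.
groups = {"boys": boys, "girls": girls, "both": both}
color_to_group = {
    color: group for group, colors in groups.items() for color in colors
}

# Series.map looks up each color; values with no key become NaN.
df["Group"] = df["Color"].map(color_to_group)

# Optional: list the colors that matched no list.
unmatched = df.loc[df["Group"].isna(), "Color"].tolist()

print(df)
print(unmatched)
```

The unmatched list is a quick way to spot typos or colors that still need to be assigned to a group.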
