简体   繁体   中英

Create column based on a defined list if other column contains string

Sorry for the ambiguous title...

I have a pre-defined list

list = ['ABC', 'DEF', 'GHI', 'JKL']

I have the following table

Item#      Value
  1        'DEF-1234'
  2        'some_text_ABC'
  3        'GHI_1"
  4        'Item is: ABC'
  5        '7713-23-88'
  6        'ABC'

And I'd like to extract the following keys based on that 'Value' column

Item#      Value              Key
  1        'DEF-1234'         'DEF'
  2        'some_text_ABC'    'ABC'
  3        'GHI_1"            'GHI'
  4        'Item is: ABC'     'ABC'
  5        '7713-23-88'        NA
  6        'ABC'              'ABC'

I'm thinking of some sort of elif loop, but I suspect that Pandas has something up its sleeve

Thank you very much in advance

You can do a str.extract :

df['Key'] = df['Value'].str.extract(fr'({"|".join(lst)})')[0]

Output:

   Item#            Value  Key
0      1       'DEF-1234'  DEF
1      2  'some_text_ABC'  ABC
2      3          'GHI_1"  GHI
3      4   'Item is: ABC'  ABC
4      5     '7713-23-88'  NaN
5      6            'ABC'  ABC

You can define a function that scans through all the keys and then use apply() to create the new column.

li = ['ABC', 'DEF', 'GHI', 'JKL']

def check_key(st):
    keys = [el for el in li if el in st]
    return  keys[0] if keys else 'NA'

df['key'] = df['Value'].apply(check_key)

Output:

           Value  key
0       DEF-1234  DEF
1  some_text_ABC  ABC
2          GHI_1  GHI
3   Item is: ABC  ABC
4     7713-23-88   NA
5            ABC  ABC

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM