简体   繁体   中英

How to check if a value in the list exists in the dataframe?

Got a data frame that has 5 columns and a list that contains 20 values.

if the value in the list exactly matches any value in the columns then it has to append the value (list value) to an empty column.

list=["siper","glock","tip",............]

INPUT (DATAFRAME ) DF1:

数据框

DESIRED OUTPUT:

数据框

My code to check if the value in the list exist in data frame.

list=["siper","glock","tip",............]
df2=[]
for i in list:
  mask=np.column_stack([df[col]==i for col in df])
  df2.append(df.loc[mask.any(axis=1)])

The above code gives a list of all rows in the data frame if the value in the list matches any column but I am not sure how to append values in the list to column1 if there is any match. Also, I want to add "Unknow" to column1 if there is no match.

Try str.extract :

lst = ['glock', 'siper']

df['D'] = df.apply(lambda x: x.str.extract(fr"\b({'|'.join(lst)})\b")
                              .bfill().iloc[0].fillna('unknown'), axis=1)
print(df)

# Output
                  A                B                       C        D
0            lfkdjs            siper              ldjkslkdjq    siper
1  the glock hammer     ldksqjflsdkj            dljkfdslkfjs    glock
2     lfdkslkdfjsdl    dflskjfsdlkjf                  tipper  unknown
3     fdlsjkfsldkjf  dlfjksdflkdsjfs  The glockmaster hammer  unknown

Try this:

df['column1'] = np.array(['unknown', *l])[np.max([df.apply(lambda col: col.str.contains(item)).mul(i+1).sum(axis=1) for i, item in enumerate(list)], axis=0)]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM