
How to check if a value in the list exists in the dataframe?

I have a data frame with 5 columns and a list that contains 20 values.

If a value in the list exactly matches any value in the columns, that list value should be written to an empty column.

list=["siper","glock","tip",............]

Input (DataFrame) DF1:

[image: data frame]

Desired output:

[image: data frame]

My code to check whether a value in the list exists in the data frame:

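import numpy as np

list=["siper","glock","tip",............]
df2=[]
for i in list:
  # Boolean matrix: True where a cell equals the current list value
  mask=np.column_stack([df[col]==i for col in df])
  # Keep the rows in which at least one column matched
  df2.append(df.loc[mask.any(axis=1)])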

The above code gives a list of all rows in the data frame where a value in the list matches any column, but I am not sure how to write the matching list value to column1. Also, I want column1 to contain "Unknown" if there is no match.

Try str.extract:

lst = ['glock', 'siper']

df['D'] = df.apply(lambda x: x.str.extract(fr"\b({'|'.join(lst)})\b")
                              .bfill().iloc[0].fillna('unknown'), axis=1)
print(df)

# Output
                  A                B                       C        D
0            lfkdjs            siper              ldjkslkdjq    siper
1  the glock hammer     ldksqjflsdkj            dljkfdslkfjs    glock
2     lfdkslkdfjsdl    dflskjfsdlkjf                  tipper  unknown
3     fdlsjkfsldkjf  dlfjksdflkdsjfs  The glockmaster hammer  unknown
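
The pattern above joins the list entries straight into a regular expression, so an entry containing characters such as `.` or `+` would be read as regex syntax. A minimal sketch of the same whole-word matching that escapes each entry with `re.escape` first; the sample frame is reconstructed from the output shown above, and the helper name `first_match` is hypothetical:

```python
import re
import pandas as pd

# Sample data reconstructed from the output shown above (assumed, for illustration).
df = pd.DataFrame({
    "A": ["lfkdjs", "the glock hammer", "lfdkslkdfjsdl", "fdlsjkfsldkjf"],
    "B": ["siper", "ldksqjflsdkj", "dflskjfsdlkjf", "dlfjksdflkdsjfs"],
    "C": ["ldjkslkdjq", "dljkfdslkfjs", "tipper", "The glockmaster hammer"],
})
lst = ["glock", "siper"]

# Escape each term so regex metacharacters are matched literally;
# \b keeps "glock" from matching inside "glockmaster".
pattern = re.compile(r"\b(" + "|".join(map(re.escape, lst)) + r")\b")

def first_match(row):
    """Hypothetical helper: return the first list term found in any cell of the row."""
    for cell in row:
        found = pattern.search(str(cell))
        if found:
            return found.group(1)
    return "unknown"

# Apply the helper row by row; the result is one label per row.
df["D"] = df.apply(first_match, axis=1)
print(df)
```

This trades the vectorised `str.extract` call for a plain Python loop per row, which is easier to read but may be slower on large frames.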

Try this:

df['column1'] = np.array(['unknown', *list])[
    np.max([df.apply(lambda col: col.str.contains(item)).any(axis=1).mul(i + 1)
            for i, item in enumerate(list)], axis=0)]
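
If "exactly matches" in the question means that a whole cell equals a list entry, rather than containing it as a substring, a simpler route may be enough. A sketch under that assumption, using made-up sample data (the original frame was only shown as an image) and a hypothetical variable name `values`:

```python
import pandas as pd

# Made-up sample data; the original frame was only shown as an image.
df = pd.DataFrame({
    "A": ["lfkdjs", "glock", "abc"],
    "B": ["siper", "xyz", "tipper"],
})
values = ["siper", "glock", "tip"]  # hypothetical name; avoids shadowing the builtin `list`

# Keep only the cells that are exactly equal to a list entry; everything else becomes NaN.
matches = df.where(df.isin(values))

# Take the first matching cell in each row, or "Unknown" when the row has none.
df["column1"] = matches.bfill(axis=1).iloc[:, 0].fillna("Unknown")
print(df)
```

Here "tipper" does not count as a match for "tip", because `isin` compares whole cell values.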
