
How to do a VLOOKUP-style lookup in Python to find text in a DataFrame?

I would like to use something like Excel's VLOOKUP, or a map function, in Python.

I have only part of the full name of some companies. I would like to know whether each company is in the dataset, as in the following example.

[Image: example, not reproduced here]

Thank you

df1['in DATASET'] = df1['NAME'].isin(df2['FULL DATASET'])
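Note that `Series.isin` tests for exact equality with the listed values, so a partial name will not match a longer full name. Here is a minimal sketch with made-up rows (the data is an assumption; only the column names come from the line above):

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the snippet above.
df1 = pd.DataFrame({"NAME": ["john usa", "mark"]})
df2 = pd.DataFrame({"FULL DATASET": ["aviation john", "usa mark sas", "mark"]})

# isin() is True only where the whole string equals an entry in df2.
exact = df1["NAME"].isin(df2["FULL DATASET"])
print(exact.tolist())  # [False, True]
```

"john usa" is reported as missing even though "john" occurs inside "aviation john", which is why a substring or word-level comparison is needed for partial names.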

I can recreate the results by checking one list against the other. However, your match criteria are not very clear or logical. "john usa" counts as a successful match with "aviation john" because "john" appears in both. But would "john usa" also match "usa mark sas", since "usa" appears in both? What about hyphens, commas, and so on? It would help if you cleared this up.

In any case, I hope the following helps. Good luck.
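To make the ambiguity concrete, here is a rough sketch of one possible rule: two names match if they share at least one whole word, with punctuation treated as a separator. The helper names `tokens` and `shares_word` are hypothetical, and this rule is only an assumption about what you might want, not something stated in the question.

```python
import re

def tokens(name):
    """Lower-case a name and split it on anything that is not a letter or digit."""
    return {t for t in re.split(r"[^0-9a-z]+", name.lower()) if t}

def shares_word(partial, full):
    """True if the two names have at least one whole word in common."""
    return bool(tokens(partial) & tokens(full))

print(shares_word("john usa", "aviation john"))   # True (shared word: john)
print(shares_word("john usa", "usa mark sas"))    # True (shared word: usa)
print(shares_word("john-usa", "Aviation, John"))  # True (punctuation acts as a separator)
```

Under this rule "john usa" matches both full names, which may or may not be what you intend.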

import pandas as pd

# df_check holds the partial names to look up; df_full holds the full dataset.
# Create two lists of records from the existing dataframes.
check_list = list(df_check.to_records(index=False))
full_list = list(df_full.to_records(index=False))

# Collect (name, found) pairs in a set, so each checked name appears only once.
results = set()

for check in check_list:  # for each record to check...
    # Split the first column into words on whitespace. The name counts as found
    # if any of its words is a substring of any record in the full list.
    found = any(word in rec[0] for word in check[0].split() for rec in full_list)
    results.add((check[0], found))

# Build a dataframe from the results (pandas does not accept a set directly).
df_results = pd.DataFrame(sorted(results), columns=["check", "found"])
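The same substring check can also be sketched with pandas string methods instead of nested loops. The sample rows, the column names `name` and `full_name`, and the helper `any_word_found` are all assumptions for illustration; the words are escaped so that characters such as `.` or `+` in a company name are not read as regex syntax.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for df_check and df_full.
df_check = pd.DataFrame({"name": ["john usa", "acme", "mark"]})
df_full = pd.DataFrame({"full_name": ["aviation john", "usa mark sas"]})

def any_word_found(name, full_names):
    """True if any word of `name` is a substring of any entry in `full_names`."""
    words = [re.escape(w) for w in name.split()]
    if not words:
        return False
    pattern = "|".join(words)  # matches if at least one word occurs
    return bool(full_names.str.contains(pattern, regex=True).any())

df_check["found"] = [
    any_word_found(name, df_full["full_name"]) for name in df_check["name"]
]
print(df_check)
```

This keeps the original row order of the names being checked. Like the loop version it is a plain substring match, so "mark" would also match "marketing".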

