I am struggling with the basics. I have a pandas DataFrame with a single column of names, and I want to compare the strings for potential duplicates using 3-4 functions from the fuzzywuzzy library. The first name should be checked against the rest of the column, then the second name, and so on. The column will have hundreds if not thousands of names. I want to create a df containing each combination of names for which at least one of the scores is above 80.
Do I need to create a list out of that df? Apologies, I know it is very basic; I just can't seem to find a solution myself.
So in the end I found a different approach to my issue. Instead of comparing the full 80k list against itself, I used itertools.combinations, which yields each unordered pair of names exactly once. That is perfect in this scenario: no name is compared with itself and no pair is scored twice.
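A minimal sketch of that approach, not the exact code I ran. To keep it runnable with the standard library only, the two scorers below are stand-ins built on difflib; the function names, the 80 threshold default, and the sample names are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def simple_ratio(a, b):
    """Stand-in scorer on a 0-100 scale (stdlib difflib, not fuzzywuzzy)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def sorted_token_ratio(a, b):
    """Stand-in scorer that ignores word order by sorting tokens first."""
    return simple_ratio(" ".join(sorted(a.split())), " ".join(sorted(b.split())))


def find_candidate_pairs(names, scorers, threshold=80):
    """Return one dict per unordered pair where any score is above threshold."""
    rows = []
    # combinations(names, 2) yields each pair once: (a, b) but never (b, a) or (a, a)
    for a, b in combinations(names, 2):
        scores = {label: fn(a, b) for label, fn in scorers.items()}
        if max(scores.values()) > threshold:
            rows.append({"name_1": a, "name_2": b, **scores})
    return rows


# Hypothetical sample data standing in for the names column
names = ["John Smith", "Jon Smith", "Smith John", "Alice Brown"]
scorers = {"ratio": simple_ratio, "token_sort": sorted_token_ratio}
pairs = find_candidate_pairs(names, scorers)

for row in pairs:
    print(row)
```

With fuzzywuzzy installed, the scorers dict could instead hold functions such as fuzz.ratio, fuzz.partial_ratio, fuzz.token_sort_ratio and fuzz.token_set_ratio, which also return scores from 0 to 100. As for the list question: one way is to pass something like df["name"].tolist() as names (the column name here is hypothetical) and wrap the result in pd.DataFrame(pairs) to get the final df. The number of pairs still grows as n*(n-1)/2, so 80k names means roughly 3.2 billion comparisons.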