
How can I find similar or approximate point names in a list of addresses, using NLP or any better solution?

I have a list of 500K addresses of various types, and I also have a list of specific point names in BD. I want to find which point name each address refers to. The problem is that many point names are misspelled in the addresses.

For example, misspelled point names found in different addresses: Narayangonj, Norayanganj, Nuraiyagonj. Correctly spelled point name in my list: Narayanganj.

How should I code it? If a word in an address closely or approximately matches a point name, the code should pick the most appropriate point name for that address.

[1] https://i.stack.imgur.com/c9lUX.jpg

The task can be divided into two parts. The first is to choose the word in the address that you want to correct. The second is to replace that word with the correctly spelled point name.

You can skip the first part and simply check every word, or use an NER model (well-known options include CoreNLP, spaCy, and Stanza) to determine which word you need.

The answer to the second part can be found in the question "How to find the most similar word in a list in python".
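
As a rough illustration of the "check every word" route, here is a minimal sketch that uses Python's standard difflib module for the fuzzy comparison. The names POINT_NAMES and find_point_name, the sample addresses, and the 0.7 cutoff are assumptions made for this example and are not from the question; the cutoff in particular would need tuning on real data.

```python
import difflib
import re

# Hypothetical reference list of correctly spelled point names.
POINT_NAMES = ["Narayanganj", "Dhaka", "Gazipur"]


def find_point_name(address, point_names=POINT_NAMES, cutoff=0.7):
    """Return the point name that best matches any word in the address, or None."""
    # Compare case-insensitively, but return the original spelling.
    by_lower = {name.lower(): name for name in point_names}
    best_name, best_score = None, 0.0
    # Part 1 (skipped NER): treat every alphabetic word as a candidate.
    for word in re.findall(r"[A-Za-z]+", address):
        word = word.lower()
        # Part 2: find the closest known name for this word, if any.
        close = difflib.get_close_matches(word, list(by_lower), n=1, cutoff=cutoff)
        if not close:
            continue
        score = difflib.SequenceMatcher(None, word, close[0]).ratio()
        if score > best_score:
            best_name, best_score = by_lower[close[0]], score
    return best_name


# Made-up addresses using the misspellings from the question.
for addr in ["House 12, Road 5, Norayanganj", "Narayangonj", "Nuraiyagonj"]:
    print(addr, "->", find_point_name(addr))
```

With 500K addresses and a long list of point names, comparing every word against every name may be slow; a dedicated fuzzy-matching library or an NER step that narrows the candidate words first could help, but that is outside this sketch.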
