Search SQL for Similar Strings of Text

I've browsed some of the questions on Stack Overflow, but can't seem to find an answer. I have imported a really large database of customer information (approximately 6 million entries) into a MySQL database. I'm using PHP to query the database. The data has not been entered in a computer-friendly way. When a customer checks their details, I also need to query the database for anyone else who has the exact same physical address and inform the user.

The problem is that the same address has been entered in all kinds of ways, for example:

105 Ocean Avenue

105 Ocean Ave.

Some addresses also have extra spaces around commas, or double spaces, for example:

105 Ocean Avenue, New York

105 Ocean Avenue , New York

This makes the equals (=) operator useless. Is there an easy way to query the database for addresses that are, for example, 80% similar or more?

Full-text search is the way forward for you.

Your queries will look like this:

SELECT * FROM table_name WHERE MATCH(col1, col2) AGAINST('search terms' IN BOOLEAN MODE)
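A rough sketch of how the 'search terms' string could be built from PHP. MATCH ... AGAINST generally needs a FULLTEXT index covering the listed columns, and in boolean mode a leading + marks a word as required. The helper name build_boolean_query and the $minLen cutoff are my own assumptions for illustration; the default of 3 mirrors the InnoDB default for innodb_ft_min_token_size, so adjust it to your server settings.

```php
<?php
// Hypothetical helper (not from the answer): turn a free-form address
// into a boolean-mode search string where every word is required.
function build_boolean_query(string $address, int $minLen = 3): string
{
    // Replace anything that is not a letter or digit with a space, so that
    // punctuation and boolean-mode operators (+ - < > ( ) ~ * " @) typed by
    // a user cannot leak into the query.
    $clean = preg_replace('/[^\p{L}\p{N}]+/u', ' ', $address) ?? '';
    $words = preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    $terms = [];
    foreach ($words as $word) {
        // strlen() counts bytes, which is fine for plain ASCII addresses.
        // Words shorter than the assumed minimum token size are dropped.
        if (strlen($word) >= $minLen) {
            $terms[] = '+' . $word;
        }
    }
    return implode(' ', $terms);
}

echo build_boolean_query('105 Ocean Avenue , New York'), "\n";
// prints: +105 +Ocean +Avenue +New +York
```

Pass the result as a bound parameter (PDO or mysqli prepared statement) in place of 'search terms'. Note that a required word such as +Avenue will not match a row that only contains "Ave.", so abbreviations still need separate handling.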

Go through the following documentation; it should serve the purpose.

https://www.w3resource.com/mysql/mysql-full-text-search-functions.php

http://www.mysqltutorial.org/mysql-full-text-search.aspx

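You can do the comparison in PHP, for example with the similar_text or levenshtein functions. Both give a measure of how similar two strings are: similar_text can report a similarity percentage, and levenshtein returns the number of single-character edits needed to turn one string into the other.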
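A minimal sketch of that idea, assuming the addresses are normalized first so that spacing, punctuation and common abbreviations stop counting as differences. The function names and the abbreviation map are hypothetical and would need extending for real data.

```php
<?php
// Hypothetical abbreviation map; extend it to match what is in the data.
const ADDRESS_ABBREVIATIONS = [
    'ave'  => 'avenue',
    'rd'   => 'road',
    'blvd' => 'boulevard',
];

// Lowercase, replace punctuation with spaces, collapse whitespace and
// expand known abbreviations.
function normalize_address(string $address): string
{
    $address = strtolower($address);
    $address = preg_replace('/[^a-z0-9]+/', ' ', $address) ?? '';
    $words = preg_split('/\s+/', trim($address), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $i => $word) {
        $words[$i] = ADDRESS_ABBREVIATIONS[$word] ?? $word;
    }
    return implode(' ', $words);
}

// Similarity as a percentage (0 to 100), via similar_text's third argument.
function address_similarity(string $a, string $b): float
{
    similar_text(normalize_address($a), normalize_address($b), $percent);
    return $percent;
}

// Edit distance between the normalized forms (0 means identical).
function address_distance(string $a, string $b): int
{
    return levenshtein(normalize_address($a), normalize_address($b));
}

// The two spellings from the question normalize to the same string.
echo address_similarity('105 Ocean Avenue', '105 Ocean Ave.'), "\n"; // prints: 100
echo address_distance('105 Ocean Avenue, New York', '105 Ocean Avenue , New York'), "\n"; // prints: 0
```

With 6 million rows, comparing every pair in PHP is not practical. One option is to store the normalized form in an extra indexed column so that exact duplicates can be found with =, and keep similar_text or levenshtein for a small set of candidate rows, accepting anything at or above a threshold such as 80.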

Alternatively, you can use MySQL's natural language full-text search.
