简体   繁体   中英

MySQL: Find similar repeated entries

Given my table

id | name
 01| Test Name
 02| Name Test
 03| Another name
 ...
 ...
 nn| Test string

I'd like to do the following, for each entry, read the first word, until a space, so in this example i would read Test, then, find all similar entries which contain Test anywhere on the string, then continue with Name , Another , and so on.

I don't want to do this manually, as I'll have to make a lot of queries, the idea is the data is imported from an old excel spreadsheet and the client wants to get repeated names, so Test Name , Test Something Name and Name Test are potential similar names.

You want to run a query, select all the names, return that to php, then loop through it, parse the string into separate words, and run a query with a fulltext index.

here's something to get you started. http://www.databasejournal.com/sqletc/article.php/1578331/Using-Fulltext-Indexes-in-MySQL---Part-1.htm

Here's my db solution:

SELECT * 
FROM princess a
INNER JOIN (SELECT 
        DISTINCT CASE
            WHEN name LIKE '% %' 
                THEN SUBSTR(name, 1, LOCATE(' ', name) - 1)
            ELSE name
        END AS 'name'
    FROM princess) b ON a.name LIKE CONCAT('%', b.name ,'%')

This will find the DISTINCT names (before a space) and then JOIN to the original table using LIKE .

You may also consider using INSTR(a.name, b.name) in place of b.name LIKE CONCAT('%', a.name ,'%') depending on how your EXPLAIN looks.

There are a ton of examples of how to split a string into multiple rows in MySQL, eg this one .

After that, you can find the exact matches easily. If you want non-exact matches, check out SOUNDEX()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM