
Select all rows where the column value is the "same" in MySQL?

I have a table that stores a URL in one column. Because URLs are parsed and written differently, the table contains duplicates. How can I select all rows whose URLs have the same domain and path?

I can select duplicates where the URL is an exact match, but that is not what I want.

Examples:

# This is a duplicate
https://www.example.com/example1
https://example.com/example1

# Not a duplicate
https://example.com/example2
https://example.com/example3

# This is a duplicate
https://example.com/example2/
https://example.com/example2

You can delete the duplicate rows with a self-join that compares the URLs after stripping "www." and any trailing slash:

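-- Compare both URLs in normalized form (no "www.", no trailing slash)
-- and delete every row except the one with the lowest id in each group.
DELETE t1 FROM urls t1
INNER JOIN urls t2
    ON TRIM(TRAILING '/' FROM REPLACE(t1.url, '://www.', '://'))
     = TRIM(TRAILING '/' FROM REPLACE(t2.url, '://www.', '://'))
WHERE t1.id > t2.id;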

Example fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=792b0a7870b1abdd91f13cd4c608ab6a
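
Since the question asks how to select the duplicates rather than delete them, the same normalization can also be used in a read-only query. The following is a minimal sketch, assuming the urls table with id and url columns from the DELETE statement above; the alias norm_url is only illustrative. It treats two URLs as equal when they match after removing "www." and trailing slashes, and it does not account for differences in scheme, letter case, or query strings.

```sql
-- List every row whose normalized URL occurs more than once.
-- Assumes a table urls(id, url); norm_url is a hypothetical alias.
SELECT u.id, u.url, d.norm_url
FROM urls u
INNER JOIN (
    -- One row per normalized URL that has at least two matching rows
    SELECT TRIM(TRAILING '/' FROM REPLACE(url, '://www.', '://')) AS norm_url
    FROM urls
    GROUP BY norm_url
    HAVING COUNT(*) > 1
) d
    ON TRIM(TRAILING '/' FROM REPLACE(u.url, '://www.', '://')) = d.norm_url
ORDER BY d.norm_url, u.id;
```

Running this before any DELETE gives a chance to review which rows would be considered duplicates of each other.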
