I have a column that may contain entries like this: abc.yahoo.com, efg.yahoo.com, hij.yahoo.com
I need to delete all the duplicates and LEAVE ONLY ONE, as I don't need the others. Such a command would be easy to write if I knew the second part (e.g. yahoo.com), but my problem is that this part is not fixed. I may also have entries such as: abc.msn.com, efg.msn.com, hij.msn.com
And I want to treat all these cases at once. Is this possible?
This assumes that you just want to strip the characters before the first . and then group on the remainder of the column:
DELETE a FROM tbl a
LEFT JOIN
(
SELECT MIN(id) AS id
FROM tbl
GROUP BY SUBSTRING(column, LOCATE('.', column))
) b ON a.id = b.id
WHERE b.id IS NULL
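Here id is your primary key column name, and column is a placeholder for the column that contains the values to group on. COLUMN is a reserved word in MySQL, so substitute your actual column name or quote it with backticks.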
This will also account for domains like xxx.co.uk, where you have two parts at the end.
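To make the grouping rule concrete, here is a minimal Python sketch that mimics what SUBSTRING(column, LOCATE('.', column)) evaluates to and which ids the DELETE would remove. The function names and sample rows are hypothetical and only serve as an illustration; the real work is still done by the SQL statement above.

```python
from typing import Dict, List, Tuple


def group_key(host: str) -> str:
    """Mimic MySQL SUBSTRING(host, LOCATE('.', host))."""
    pos = host.find(".")  # -1 when there is no dot (LOCATE returns 0 in MySQL)
    if pos == -1:
        return ""  # SUBSTRING(str, 0) yields an empty string in MySQL
    return host[pos:]  # keep the first dot and everything after it


def ids_to_delete(rows: List[Tuple[int, str]]) -> List[int]:
    """Return every id that is not the smallest id within its group."""
    keep: Dict[str, int] = {}
    for row_id, host in rows:
        key = group_key(host)
        if key not in keep or row_id < keep[key]:
            keep[key] = row_id
    survivors = set(keep.values())
    return sorted(row_id for row_id, _ in rows if row_id not in survivors)


# Hypothetical sample data: (id, column value)
rows = [
    (1, "abc.yahoo.com"),
    (2, "efg.yahoo.com"),
    (3, "hij.msn.com"),
    (4, "abc.msn.com"),
    (5, "www.ebay.co.uk"),
]

print(group_key("www.ebay.co.uk"))  # .ebay.co.uk
print(ids_to_delete(rows))          # [2, 4]
```

Note that a value without any subdomain, such as yahoo.com, produces the key .com and would therefore be grouped with every other bare .com value, which is why the approach assumes each entry has a leading label to strip.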
Make sure you have a backup of your current data, or run this operation within a transaction (so that you can ROLLBACK; if the result isn't what you need).
EDIT: If, after deleting the duplicates, you want to replace the letters before the first . with *, you can simply use:
UPDATE tbl
SET column = CONCAT('*', SUBSTRING(column, LOCATE('.', column)))
To delete the duplicates you can use
DELETE t1 FROM your_table t1
LEFT JOIN
(
SELECT MIN(id) AS id
FROM your_table
GROUP BY SUBSTRING_INDEX(REVERSE(col), '.', 2)
) t2 ON t2.id = t1.id
WHERE t2.id IS NULL
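As an illustration of the grouping key used here, the following Python sketch emulates SUBSTRING_INDEX(REVERSE(col), '.', 2) and the extra REVERSE that turns it back into a readable domain. The function names are hypothetical; this is only meant to show which rows end up in the same group.

```python
def last_two_key(host: str) -> str:
    """Mimic MySQL SUBSTRING_INDEX(REVERSE(host), '.', 2)."""
    reversed_host = host[::-1]
    # Everything before the second dot of the reversed string
    # (the whole string if there are fewer than two dots).
    return ".".join(reversed_host.split(".")[:2])


def last_two_labels(host: str) -> str:
    """Mimic REVERSE(SUBSTRING_INDEX(REVERSE(host), '.', 2))."""
    return last_two_key(host)[::-1]


print(last_two_key("abc.yahoo.com"))      # moc.oohay
print(last_two_labels("abc.yahoo.com"))   # yahoo.com
print(last_two_labels("www.ebay.co.uk"))  # co.uk
```

The reversed string is fine as a GROUP BY key because it is only used to decide which rows belong together. The last line shows the limitation: every .co.uk host collapses into the same group.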
If you need to create a UNIQUE constraint for that, you can do the following:
1. Add another field to hold the domain value
ALTER TABLE your_table ADD COLUMN `domain` VARCHAR(100) NOT NULL DEFAULT '';
2. Update it with the correct values
UPDATE your_table SET domain = REVERSE(SUBSTRING_INDEX(REVERSE(col), '.', 2));
3. Add the unique constraint (ALTER IGNORE was removed in MySQL 5.7.4; on newer versions, delete the duplicates first and drop the IGNORE keyword)
ALTER IGNORE TABLE your_table ADD UNIQUE domain (domain);
4. Add BEFORE INSERT and BEFORE UPDATE triggers to set the domain column (each trigger needs its own name)
DELIMITER $$
CREATE TRIGGER `your_table_before_insert` BEFORE INSERT ON `your_table` FOR EACH ROW
BEGIN
SET new.domain = REVERSE(SUBSTRING_INDEX(REVERSE(new.col), '.', 2));
END$$
CREATE TRIGGER `your_table_before_update` BEFORE UPDATE ON `your_table` FOR EACH ROW
BEGIN
SET new.domain = REVERSE(SUBSTRING_INDEX(REVERSE(new.col), '.', 2));
END$$
DELIMITER ;
Note: this assumes the domain is the last two labels when the value is split on '.', so it will not work for a domain such as ebay.co.uk. For that you will probably need to write a stored function that returns the domain for a given host and use it instead of REVERSE(SUBSTRING_INDEX(...)).
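The logic such a function would need can be sketched in Python as follows. This is only a rough illustration under stated assumptions: the suffix set below is a hypothetical, incomplete sample (a real solution would likely rely on a maintained list such as the Public Suffix List), and the function name is made up for this example.

```python
# Hypothetical, incomplete sample of two-part public suffixes.
MULTI_PART_SUFFIXES = {"co.uk", "org.uk", "com.au"}


def registrable_domain(host: str) -> str:
    """Return the domain part of a host, keeping three labels
    when the last two labels form a known two-part suffix."""
    labels = host.lower().split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_PART_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


print(registrable_domain("abc.yahoo.com"))   # yahoo.com
print(registrable_domain("www.ebay.co.uk"))  # ebay.co.uk
```

A stored function in MySQL would follow the same shape: check whether the last two labels match a known suffix, then return either the last three or the last two labels.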