I have a MySQL database similar to:
+----+---------+---------+------------------+....
| id | unique1 | unique2 | genaric_data     |....
+----+---------+---------+------------------+....
| 0  | 100     | 1C7     | {data container} |....
+----+---------+---------+------------------+....
| 1  | 100     | 1C7     | {data container} |....
+----+---------+---------+------------------+....
| 2  | 100     | 1C8     | {data container} |....
+----+---------+---------+------------------+....
| 3  | 101     | ---     | {data container} |....
+----+---------+---------+------------------+....
| 4  | 102     | 0       | {data container} |....
+----+---------+---------+------------------+....
| 5  | 103     | 1       | {data container} |....
.................................................
I need a way to add an extra column showing how many times each row's combination of unique fields (unique1 and unique2) occurs. I will then need to clean up the data manually.
I want a query to return:
+----+---------+---------+------+------------------+....
| id | unique1 | unique2 | dupe | genaric_data     |....
+----+---------+---------+------+------------------+....
| 0  | 100     | 1C7     | 2    | {data container} |....
+----+---------+---------+------+------------------+....
| 1  | 100     | 1C7     | 2    | {data container} |....
+----+---------+---------+------+------------------+....
| 2  | 100     | 1C8     | 1    | {data container} |....
+----+---------+---------+------+------------------+....
| 3  | 101     | ---     | 1    | {data container} |....
+----+---------+---------+------+------------------+....
| 4  | 102     | 0       | 1    | {data container} |....
+----+---------+---------+------+------------------+....
| 5  | 103     | 1       | 1    | {data container} |....
.......................................................
I have had this problem for a while, and currently my only solution is to export the data to Excel and find the duplicates there.
Thanks.
Edit: The possible duplicate is not a solution to my problem since when I execute:
SELECT *,count(*) FROM `database`
GROUP BY `unique1`
HAVING count(*) > 1
in phpMyAdmin (all I'm allowed access to), it merges every row with the same unique1 into one line.
The solution to your problem is to use GROUP BY:
SELECT unique1, unique2, Count(*) As colCount FROM YourTable
GROUP BY unique1, unique2
HAVING Count(*) > 1
This will return all combinations of unique1 and unique2 that occur more than once.
In a second step, you can build a query that returns all affected rows.
SELECT YourTable.*, rstDuplicates.colCount
FROM YourTable INNER JOIN (
SELECT unique1, unique2, Count(*) As colCount FROM YourTable
GROUP BY unique1, unique2
HAVING Count(*) > 1
) As rstDuplicates ON YourTable.unique1 = rstDuplicates.unique1 And YourTable.unique2 = rstDuplicates.unique2
This will output all rows that have at least one duplicate. The colCount column shows the number of appearances.
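As a quick sanity check, the two-step query above can be run against the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, since only the table contents are known), and the helper name duplicate_rows is made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the table and rows mirror the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, unique1 INTEGER, unique2 TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(0, 100, "1C7"), (1, 100, "1C7"), (2, 100, "1C8"),
     (3, 101, "---"), (4, 102, "0"), (5, 103, "1")],
)

def duplicate_rows(conn):
    """Return every row whose (unique1, unique2) pair occurs more than once."""
    query = """
        SELECT YourTable.*, rstDuplicates.colCount
        FROM YourTable INNER JOIN (
            SELECT unique1, unique2, COUNT(*) AS colCount FROM YourTable
            GROUP BY unique1, unique2
            HAVING COUNT(*) > 1
        ) AS rstDuplicates
          ON YourTable.unique1 = rstDuplicates.unique1
         AND YourTable.unique2 = rstDuplicates.unique2
        ORDER BY YourTable.id
    """
    return conn.execute(query).fetchall()

print(duplicate_rows(conn))
# [(0, 100, '1C7', 2), (1, 100, '1C7', 2)]
```

Only ids 0 and 1 come back, because (100, 1C7) is the only pair that appears twice. Rows without a duplicate are filtered out by the HAVING clause.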
If you want to add a field with the information, a correlated subquery is perhaps the easiest way:
select t.*,
       (select count(*)
        from YourTable t2
        where t2.unique1 = t.unique1 and t2.unique2 = t.unique2
       ) as dupecnt
from YourTable t;
Sometimes this is efficient (with an index on (unique1, unique2)). Sometimes it is more efficient to do the aggregation in the FROM clause:
select t.*, t2.dupecnt
from YourTable t join
     (select unique1, unique2, count(*) as dupecnt
      from YourTable
      group by unique1, unique2
     ) t2
     on t2.unique1 = t.unique1 and t2.unique2 = t.unique2;
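To check that both forms give the dupe column the question asks for, here is a small sketch that runs them side by side on the sample rows. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the table name YourTable and the helper make_db are placeholders, not anything from the original schema.

```python
import sqlite3

# Sample rows from the question: (id, unique1, unique2).
SAMPLE_ROWS = [(0, 100, "1C7"), (1, 100, "1C7"), (2, 100, "1C8"),
               (3, 101, "---"), (4, 102, "0"), (5, 103, "1")]

def make_db():
    """Build an in-memory table holding the sample rows (SQLite, not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (id INTEGER, unique1 INTEGER, unique2 TEXT)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn

# Correlated subquery: count the matching rows once per outer row.
SUBQUERY_SQL = """
    SELECT t.id, t.unique1, t.unique2,
           (SELECT COUNT(*)
            FROM YourTable t2
            WHERE t2.unique1 = t.unique1 AND t2.unique2 = t.unique2
           ) AS dupecnt
    FROM YourTable t
    ORDER BY t.id
"""

# Aggregation in the FROM clause: count each pair once, then join back.
JOIN_SQL = """
    SELECT t.id, t.unique1, t.unique2, t2.dupecnt
    FROM YourTable t JOIN
         (SELECT unique1, unique2, COUNT(*) AS dupecnt
          FROM YourTable
          GROUP BY unique1, unique2
         ) t2
         ON t2.unique1 = t.unique1 AND t2.unique2 = t.unique2
    ORDER BY t.id
"""

conn = make_db()
by_subquery = conn.execute(SUBQUERY_SQL).fetchall()
by_join = conn.execute(JOIN_SQL).fetchall()

for row in by_join:
    print(row)
```

On this data both queries return all six rows, with a count of 2 for ids 0 and 1 and a count of 1 for the rest, which matches the table the question wants. Which form is faster on a real MySQL table depends on its size and indexes.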