I am using DB2 and am trying to count duplicate rows in a table called ML_MEASURE. I define a duplicate in this table as a row that has the same DATETIME and TAG_NAME values as another row. I tried the following:
SELECT
DATETIME,
TAG_NAME,
COUNT(*) AS DUPLICATES
FROM
ML_MEASURE
GROUP BY DATETIME, TAG_NAME
HAVING COUNT(*) > 1
The query doesn't fail, but I get an empty result, even though I know for a fact that I have at least one duplicate. When I ran the query below, I got the correct result for this specific TAG_NAME and DATETIME:
SELECT
DATETIME,
TAG_NAME,
COUNT(*) AS DUPLICATES
FROM
ML_MEASURE
WHERE
DATETIME='2018-03-23 15:09:30' AND
TAG_NAME='HOG.613KU201'
GROUP BY
DATETIME,
TAG_NAME
The result of the second query looked like this:
DATETIME TAG_NAME DUPLICATES
--------------------- ------------ ----------
2018-03-23 15:09:30.0 HOG.613KU201 3
What am I doing wrong in the first query?
* UPDATE *
My table is row-organized; I'm not sure whether that makes any difference.
Yes, you should get the same row back on the first query. If you had a NOT ENFORCED TRUSTED primary key or unique constraint on those two columns, then the optimizer would be within its rights to trust the constraint and return no rows. However, from a quick test, I don't believe it does that for this query. Do you have any indexes defined on the table?
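One way to check for such a constraint or index is to look at the catalog. The queries below are only a sketch: they assume Db2 for LUW, that ML_MEASURE lives in the current schema, and a version recent enough for SYSCAT.TABCONST to have the TRUSTED column (drop that column from the select list if it does not).

```sql
-- Primary key ('P') and unique ('U') constraints on the table.
-- ENFORCED = 'N' together with TRUSTED = 'Y' would indicate an
-- informational constraint that the optimizer is allowed to rely on.
-- Assumption: the table is in the current schema.
SELECT CONSTNAME, TYPE, ENFORCED, TRUSTED, ENABLEQUERYOPT
FROM SYSCAT.TABCONST
WHERE TABSCHEMA = CURRENT SCHEMA
  AND TABNAME = 'ML_MEASURE'
  AND TYPE IN ('P', 'U');

-- Indexes on the table. UNIQUERULE is 'P' (primary key),
-- 'U' (unique) or 'D' (duplicates allowed).
SELECT INDNAME, COLNAMES, UNIQUERULE
FROM SYSCAT.INDEXES
WHERE TABSCHEMA = CURRENT SCHEMA
  AND TABNAME = 'ML_MEASURE';
```

If the first query returns a row covering DATETIME and TAG_NAME with ENFORCED = 'N', that would be worth mentioning in the question.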
(PS: I assume you are not running the query from a shell prompt, where an unquoted > 1 would redirect the output to a file named 1.)
This worked for me:
SELECT * FROM (
SELECT DATETIME, TAG_NAME, COUNT(*) AS DUPLICATES
FROM ML_MEASURE
GROUP BY DATETIME, TAG_NAME
) WHERE DUPLICATES > 1