Find partially duplicated rows in a SQL table in an IBM Netezza database
This question is related to my previous question:
error of finding distinct combinations of multiple columns in IBM Netezza SQL table
Now I need to find partially duplicated rows in a table, using SQL in IBM Netezza with Aginity Workbench.
The table looks like this:
id1 id2 **id3 id4 id5 id6** id7 id8 id9
NY 63689 eiof 394 9761 9318 2846 2319 215
NY 63689 eiof 394 9761 9318 97614 648 645
CT 39631 pfef 92169 9418 9167 164 3494 34
CT 39631 pfef 92169 9418 9167 3649 7789 568
id3 id4 id5 id6 are duplicated for id1 = NY and id2 = 63689
id3 id4 id5 id6 are duplicated for id1 = CT and id2 = 39631
The result should be:
id1 id2 value
NY 63689 2
CT 39631 2
UPDATE: I only need to count the partial duplicates on id3, id4, id5, id6 for each id1 and id2. I do not care about the columns id7, id8, id9.
I used this SQL query:
SELECT id1, id2,
       COUNT(*) AS value
FROM
(
    SELECT
        id1, id2, id3, id4, id5, id6
    FROM
        myTable
    GROUP BY
        id1, id2, id3, id4, id5, id6
)
AS uniques
GROUP BY
    id1, id2
But I got:
id1 id2 value
NY 63689 number of combinations of id7 id8 id9
CT 39631 number of combinations of id7 id8 id9
Any help would be appreciated.
The following query produces the output you want. Is this what you want to do?
SELECT id1, id2, COUNT(*) AS value
FROM myTable
GROUP BY id1, id2;
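To check the query above against the sample rows from the question, here is a minimal sketch. The table name dup_demo and the column types are assumptions made for illustration; only the data values come from the question.

```sql
-- Hypothetical demo table: the name dup_demo and the column types are assumptions.
CREATE TEMP TABLE dup_demo (
    id1 VARCHAR(2), id2 INTEGER, id3 VARCHAR(10),
    id4 INTEGER, id5 INTEGER, id6 INTEGER,
    id7 INTEGER, id8 INTEGER, id9 INTEGER
);

-- Sample rows copied from the question, one INSERT per row.
INSERT INTO dup_demo VALUES ('NY', 63689, 'eiof', 394, 9761, 9318, 2846, 2319, 215);
INSERT INTO dup_demo VALUES ('NY', 63689, 'eiof', 394, 9761, 9318, 97614, 648, 645);
INSERT INTO dup_demo VALUES ('CT', 39631, 'pfef', 92169, 9418, 9167, 164, 3494, 34);
INSERT INTO dup_demo VALUES ('CT', 39631, 'pfef', 92169, 9418, 9167, 3649, 7789, 568);

-- Count the rows for each (id1, id2) pair.
SELECT id1, id2, COUNT(*) AS value
FROM dup_demo
GROUP BY id1, id2
ORDER BY id1 DESC;
-- Rows returned for this sample data:
--   NY  63689  2
--   CT  39631  2
```

Note that this counts every row for an (id1, id2) pair. It matches the expected result here only because all rows within each pair share the same id3, id4, id5, id6 values.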
EDIT:
If you want complete duplicates (of all columns) but only want to show the first two:
SELECT id1, id2, COUNT(*) as value
FROM myTable
GROUP BY id1, id2, id3, id4, id5, id6;
You can add having count(*) > 1 if you only want examples with duplicates.
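Put together, the query with that filter might look like the sketch below. It assumes the same myTable and column names as in the question.

```sql
-- Keep only the (id1 .. id6) combinations that occur more than once.
SELECT id1, id2, COUNT(*) AS value
FROM myTable
GROUP BY id1, id2, id3, id4, id5, id6
HAVING COUNT(*) > 1;
```

Because the grouping is on six columns but only id1 and id2 are selected, an (id1, id2) pair can appear on more than one output row if it has several different duplicated id3..id6 combinations.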