Delete duplicates from a table (PL/SQL, Oracle)
I have a table that contains data in the following fields:
contact_id | call_siebel | start_time | operator_text | client_text | client_id | phone_num
I found duplicates with:
SELECT operator_text, client_text, client_id
FROM TABLE
GROUP BY operator_text, client_text, client_id
HAVING COUNT(*) > 1;
For example, there are four rows with identical data: two of them have the same, correct client_id value and the other two have client_id = '-1'. I need to keep only one of the four, one whose client_id is filled in correctly.
My plan was to create a test table, fill it with one row per valid client_id taken from the existing duplicates, delete the duplicates from the main table by client_id, and finally insert the rows from the test table back into the main one.
What is the correct way to insert data from the main table into the test table so that each client_id from the duplicates appears only once? In my version, I implemented GROUP BY incorrectly:
INSERT /*+ append enable_parallel_dml parallel(16)*/
INTO table_test
SELECT
DISTINCT
t.contact_id,
t.call_siebel,
t.start_time,
t.operator_text,
t.client_text,
t.client_id,
t.phone_num
FROM table t
WHERE t.client_id != '-1'
GROUP BY t.operator_text, t.client_text, t.client_id
HAVING COUNT(t.client_id) > 1;
After this I could easily finish the job using:
DELETE table
WHERE client_id IN (SELECT t.client_id
                    FROM table_test t);
INSERT INTO table
SELECT *
FROM table_test;
How about skipping that "temporary" table entirely?
Sample data:
SQL> select * from test;
 CLIENT_ID OPERATOR_TEXT        CLIENT_TEXT
---------- -------------------- --------------------
         1 a                    a                     --> the first two rows
         1 a                    a                     --> are duplicates
        -1 a                    a                     --> this is an invalid row (because of a negative CLIENT_ID)
         1 b                    c                     --> this is OK
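To reproduce the sample data, a setup script along these lines should work. This is only a sketch: the column types (NUMBER and VARCHAR2(20)) are assumptions inferred from the column widths and alignment in the output above, not something stated in the post.

```sql
-- Assumed column types, inferred from the SQL*Plus output; adjust as needed.
CREATE TABLE test (
  client_id     NUMBER,
  operator_text VARCHAR2(20),
  client_text   VARCHAR2(20)
);

-- Two identical valid rows, one invalid row with the same texts, one unique row.
INSERT INTO test (client_id, operator_text, client_text) VALUES ( 1, 'a', 'a');
INSERT INTO test (client_id, operator_text, client_text) VALUES ( 1, 'a', 'a');
INSERT INTO test (client_id, operator_text, client_text) VALUES (-1, 'a', 'a');
INSERT INTO test (client_id, operator_text, client_text) VALUES ( 1, 'b', 'c');

COMMIT;
```

Note that the question compares client_id with the string '-1', so in the real table the column may be a character type; in that case the numeric comparison a.client_id < 0 would need to become a.client_id = '-1'.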
Delete invalid rows (those with a negative client_id, line #2) and all duplicates (the rowid subquery):
SQL> delete from test a
  2  where a.client_id < 0
  3     or a.rowid > (select min(b.rowid)
  4                   from test b
  5                   where b.client_id = a.client_id
  6                     and b.operator_text = a.operator_text
  7                     and b.client_text = a.client_text
  8                  );
2 rows deleted.
Result:
SQL> select * from test;
 CLIENT_ID OPERATOR_TEXT        CLIENT_TEXT
---------- -------------------- --------------------
         1 a                    a
         1 b                    c
SQL>
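As a possible alternative, the same cleanup can be sketched with the ROW_NUMBER analytic function. This is not the method shown above, and it rests on an assumption: rows that share operator_text and client_text are treated as one duplicate group, with valid client_id values ranked ahead of negative ones so that a valid row is the one kept.

```sql
-- Sketch only: assumes duplicates are defined by operator_text + client_text.
DELETE FROM test
 WHERE rowid IN (
         SELECT rid
           FROM (SELECT rowid AS rid,
                        ROW_NUMBER() OVER (
                          PARTITION BY operator_text, client_text
                          -- valid client_id sorts first, then lowest rowid
                          ORDER BY CASE WHEN client_id < 0 THEN 1 ELSE 0 END,
                                   rowid
                        ) AS rn
                   FROM test)
          WHERE rn > 1   -- everything except the first row of each group
       );
```

On the sample data this removes the same two rows. The behaviour differs in one case: if a group contains only rows with a negative client_id, this version keeps one of them, whereas the delete above removes every negative row regardless.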