SQL Delete Rows Based on Multiple Criteria

I am trying to delete rows from a data set based on multiple criteria, but I am receiving a syntax error. Here is the current code:

With cte As (
        Select *, 
                Row_Number() Over(Partition By ID, Numb1 Order by ID) as RowNumb
        from DataSet
)
Delete from cte Where RowNumb > 1;

Where DataSet looks like this:

[image: enter image description here]

I want to delete every record whose ID and Numb1 both match those of another record. So I would expect the code to delete all rows except:

[image: enter image description here]

WITH clauses in Vertica only support SELECT or INSERT, not DELETE/UPDATE.

Vertica Documentation

The cte is a temporary result set, not a real table. You cannot delete from it here. It is effectively read-only.

If you are trying to delete duplicates out of the original DataSet table, you have to delete from DataSet, not from the cte.

Try this:

with cte as (
        select ID,
                Row_Number() Over(Partition By ID, Numb1 Order by ID) as RowNumb
        from DataSet
)
delete from DataSet
where ID in (select ID from cte where RowNumb > 1)
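A variant of the same idea that avoids the WITH clause and matches on both ID and Numb1 rather than on ID alone can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module with SQLite standing in for Vertica (an assumption, so syntax support may differ there), and the rows are a subset of the sample data retyped later in this thread.

```python
import sqlite3

# SQLite in memory is a stand-in for Vertica here (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DataSet (ID INTEGER, Numb1 INTEGER, state_coding TEXT)")
conn.executemany(
    "INSERT INTO DataSet VALUES (?, ?, ?)",
    [
        (202003, 4718868, "D"),
        (202003, 35756, "AA"),
        (204281, 146199, "D"),
        (204281, 146199, "D"),
        (204346, 108094, "D"),
        (204346, 108094, "D"),
        (204389, 14642, "DD"),
        (204389, 96504, "F"),
    ],
)

# Delete every row whose (ID, Numb1) pair occurs more than once.
conn.execute(
    """
    DELETE FROM DataSet
    WHERE (ID, Numb1) IN (
        SELECT ID, Numb1
        FROM DataSet
        GROUP BY ID, Numb1
        HAVING COUNT(*) > 1
    )
    """
)

remaining = conn.execute(
    "SELECT ID, Numb1, state_coding FROM DataSet ORDER BY ID, Numb1"
).fetchall()
print(remaining)
```

Matching on the pair matters when one ID has both duplicated and unique Numb1 values: filtering on ID alone would remove the unique rows as well.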

You can't delete from CTEs. Write the DELETE statement by hand and run it inside a transaction that you roll back, or, if you have the permissions, copy the table and test against the copy.
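The "delete, inspect, roll back" approach might look like the sketch below. It uses Python's sqlite3 module with SQLite in place of Vertica (an assumption; transaction behavior in Vertica should be checked separately), and a few rows from the sample data retyped later in this thread.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / ROLLBACK below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE DataSet (ID INTEGER, Numb1 INTEGER, state_coding TEXT)")
conn.executemany(
    "INSERT INTO DataSet VALUES (?, ?, ?)",
    [
        (202003, 4718868, "D"),
        (202003, 35756, "AA"),
        (204281, 146199, "D"),
        (204281, 146199, "D"),
        (204346, 108094, "D"),
        (204346, 108094, "D"),
    ],
)


def row_count():
    return conn.execute("SELECT COUNT(*) FROM DataSet").fetchone()[0]


before = row_count()

conn.execute("BEGIN")
cur = conn.execute(
    """
    DELETE FROM DataSet
    WHERE (ID, Numb1) IN (
        SELECT ID, Numb1 FROM DataSet GROUP BY ID, Numb1 HAVING COUNT(*) > 1
    )
    """
)
would_delete = cur.rowcount  # rows the DELETE touched inside the transaction
during = row_count()         # what the table looks like before rolling back
conn.execute("ROLLBACK")     # undo the trial run

after = row_count()
print(before, would_delete, during, after)
```

Once the numbers look right, the same statement can be run again and committed instead of rolled back.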

I am not very experienced with Vertica, but it seems that it is not very flexible about delete statements.

One way to do it would be to use a temporary table to store the rows that you want to keep, then truncate the original table, and insert back into it from the temp table:

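-- Vertica temp tables default to ON COMMIT DELETE ROWS; keep the rows
-- across the commit that the TRUNCATE below triggers
create temporary table MyTempTable on commit preserve rows as
select id, numb1, state_coding
from (
    -- cnt = number of rows sharing this (id, numb1) pair
    select d.*, count(*) over(partition by id, numb1) as cnt
    from DataSet d
) as t
where cnt = 1;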

truncate table DataSet;

insert into DataSet
select id, numb1, state_coding from MyTempTable;

Note that I used a window count instead of row_number. This removes every record for which at least one other record exists with the same id and numb1, which is what I understand you want from your sample data and expected results.

Important: make sure to back up your entire table before you do this!
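To make the difference between the window count and row_number concrete, here is a small sketch that computes both over the same partition. It runs on Python's sqlite3 module (SQLite as a stand-in for Vertica, which is an assumption, and it needs an SQLite build with window functions, 3.25 or later) and uses four rows from the sample data retyped later in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DataSet (ID INTEGER, Numb1 INTEGER, state_coding TEXT)")
conn.executemany(
    "INSERT INTO DataSet VALUES (?, ?, ?)",
    [
        (202003, 4718868, "D"),
        (202003, 35756, "AA"),
        (204281, 146199, "D"),
        (204281, 146199, "D"),
    ],
)

rows = conn.execute(
    """
    SELECT ID, Numb1,
           ROW_NUMBER() OVER (PARTITION BY ID, Numb1 ORDER BY ID) AS RowNumb,
           COUNT(*)     OVER (PARTITION BY ID, Numb1)             AS cnt
    FROM DataSet
    ORDER BY ID, Numb1, RowNumb
    """
).fetchall()

# Deleting RowNumb > 1 keeps one copy of each duplicated pair.
kept_by_row_number = [(r[0], r[1]) for r in rows if r[2] == 1]
# Keeping cnt = 1 drops every copy of a duplicated pair.
kept_by_count = [(r[0], r[1]) for r in rows if r[3] == 1]

print(kept_by_row_number)
print(kept_by_count)
```

With row_number, one of the two (204281, 146199) rows survives; with the count, neither does, which matches the expected result described in the question.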

You'd have saved me ~5 min had you pasted the data as text and not as a picture, as I could not copy-paste and had to retype ...

Having said that:

Rebuild the table here:

DROP TABLE IF EXISTS input;
CREATE TABLE input(id,numb1,state_coding) AS (
          SELECT 202003,4718868,'D'
UNION ALL SELECT 202003,  35756,'AA'
UNION ALL SELECT 204281, 146199,'D'
UNION ALL SELECT 204281, 146199,'D'
UNION ALL SELECT 204346, 108094,'D'
UNION ALL SELECT 204346, 108094,'D'
UNION ALL SELECT 204389,  14642,'DD'
UNION ALL SELECT 204389,  96504,'F'
UNION ALL SELECT 204392,  22010,'D'
UNION ALL SELECT 204392,   8051,'G'
UNION ALL SELECT 204400,  74118,'D'
UNION ALL SELECT 204400, 103900,'D'
UNION ALL SELECT 204406,1387304,'D'
UNION ALL SELECT 204406,      0,'HJ'
UNION ALL SELECT 204516,    894,'D'
UNION ALL SELECT 204516,   3927,'D'
UNION ALL SELECT 204586, 234235,'D'
UNION ALL SELECT 204586, 234235,'D'
)
;

And then: based on what was said in the other responses, and keeping in mind that a mass delete of a large part of a table is, and not only in Vertica, best implemented as an INSERT ... SELECT with the WHERE condition inverted, here goes:

CREATE TABLE input_help AS
SELECT * FROM input
GROUP BY id,numb1,state_coding
HAVING COUNT(*) = 1;

DROP TABLE input;
ALTER TABLE input_help RENAME TO input;

At least, it works with that simplicity if the whole row is the same. I notice you don't put state_coding into the condition yourself. Otherwise, it gets slightly more complicated.

Or did you want to re-insert one row of each set of duplicates afterwards?

Then just build input_help as SELECT DISTINCT * FROM input; then drop, then rename.
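The two rebuild variants (drop every duplicated row, or keep one copy of each) can be compared side by side with the sketch below. It uses Python's sqlite3 module with SQLite standing in for Vertica (an assumption), the helper name rebuild is made up for this illustration, and the rows are four of the sample rows from the table above.

```python
import sqlite3

ROWS = [
    (204400, 74118, "D"),
    (204400, 103900, "D"),
    (204586, 234235, "D"),
    (204586, 234235, "D"),
]


def rebuild(select_sql):
    """Build input_help from select_sql, swap it in for input, return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE input (id INTEGER, numb1 INTEGER, state_coding TEXT)")
    conn.executemany("INSERT INTO input VALUES (?, ?, ?)", ROWS)
    conn.execute(f"CREATE TABLE input_help AS {select_sql}")
    conn.execute("DROP TABLE input")
    conn.execute("ALTER TABLE input_help RENAME TO input")
    result = conn.execute(
        "SELECT id, numb1, state_coding FROM input ORDER BY id, numb1"
    ).fetchall()
    conn.close()
    return result


# Variant 1: keep only rows that occur exactly once.
only_unique = rebuild(
    "SELECT id, numb1, state_coding FROM input "
    "GROUP BY id, numb1, state_coding HAVING COUNT(*) = 1"
)

# Variant 2: keep one copy of every distinct row.
one_of_each = rebuild("SELECT DISTINCT id, numb1, state_coding FROM input")

print(only_unique)
print(one_of_each)
```

Both variants compare whole rows. If two rows share id and numb1 but differ in state_coding, neither query treats them as duplicates.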
