Improve Oracle delete query performance

This is my query. I plan to run it in batches of roughly 5000 rows, hence the rownum < 5000:

delete my_table
where  rownum < 5000 
and    type = 'Happy'
and    id not in 
       ( select max_id
         from   ( select max(log_id) max_id
                  ,      object_id
                  ,      type
                  from   my_table
                  where  type = 'Happy'
                  group
                  by     object_id
                  ,      type
                )
        )

I want to delete the 'Happy' records but keep the row with the maximum log id for each object id.

I hope that makes sense.

Should I be using some sort of join to improve performance?

I think this might run faster as a correlated subquery:

Delete
    from my_table
    where type = 'Happy' and
          exists (select 1
                  from my_table t2
                  where t2.object_id = my_table.object_id and
                        t2.type = my_table.type and
                        t2.id > my_table.id
                 );

Then, an index on my_table(object_id, type, id) might also help this query.
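As a rough sketch of how the suggested index and the original batching idea could be combined with the correlated subquery: the index name my_table_obj_type_id_ix and the batch size of 5000 are assumptions for illustration, and the column names follow the query above (id is taken to be the log id). Treat it as a starting point, not a tuned implementation.

```sql
-- Assumed index name; the column list is the one suggested above.
create index my_table_obj_type_id_ix
    on my_table (object_id, type, id);

-- Delete older 'Happy' rows in batches until none are left.
begin
    loop
        delete from my_table t
        where  t.type = 'Happy'
        and    rownum <= 5000            -- assumed batch size
        and    exists (select 1
                       from   my_table t2
                       where  t2.object_id = t.object_id
                       and    t2.type      = t.type
                       and    t2.id        > t.id);

        exit when sql%rowcount = 0;      -- nothing deleted: only the newest rows remain
        commit;                          -- commit each batch before starting the next
    end loop;
end;
/
```

Because each batch only removes rows that have a newer row for the same object_id and type, the newest row per object is never deleted, no matter how many batches run.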

Since you only care to delete ANY 5000 log entries for type = 'Happy', as long as it's not the most recent for any object_id, you can do something like this:

delete
from my_table
where log_id in (
    select log_id from (
        select log_id, 
            row_number() over (partition by object_id order by log_id desc) rnk
        from my_table
        where type = 'Happy'
        and rownum <= 5000
    )
    where rnk > 1
)

This is different from what you have because, in your approach, you still need to calculate the max(id) per object across the entire table, which isn't necessary (and log tables can get very large). You just need to make sure you're not deleting the "newest" row (per object) among the 5000 rows in the batch. Personally, I prefer to set up log tables using partitions, but not everyone has this option.

Hope that helps.

You could simplify the query to:

delete my_table
where  rownum < 5000 
and    type = 'Happy'
and    id not in (select   max(log_id) max_id
                  from     my_table
                  where    type = 'Happy'
                  group by object_id, type)
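Note that rownum < 5000 removes at most 4999 rows per run, so the statement has to be repeated until it deletes nothing. One possible way to check whether more runs are needed is the query below; the alias happy_rows is just an illustrative name, and the columns are assumed to be as in the queries above.

```sql
-- Objects that still have more than one 'Happy' row, i.e. more batches to run.
select   object_id, count(*) as happy_rows
from     my_table
where    type = 'Happy'
group by object_id
having   count(*) > 1;
```

When this returns no rows, each object has at most one 'Happy' row left.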
