
Advice to improve Query Performance (Oracle)

I have a table (A) with 116,317,979 records, and it grows by around 750,000 records per day.

I need to efficiently fetch the last 3 complete days of data from the table using a date column (the column stores date and time). So the query will be

select * from A where date_column >= trunc(sysdate) - 3

I also need to join table A with table B such that

select * from A 
left outer join B 
on A.X = B.X and A.Y = B.Y and A.Z = B.Z and B.M = 'XYZ' and B.N = 'UIM'
where A.date_column >= trunc(sysdate) - 3

Unique Index & PK of Table B (X,Y,Z,M,N)

Unique Index & PK of Table A (ID)

Proposed IDX 1 on Table A (date_column)

Proposed IDX 2 on Table A (X,Y,Z)

Time without Indexes 34 sec
Time with IDX 1      32 sec
Time with IDX 1 & 2  27 sec //Sorry about the mistype

I thought that just adding an index on A.date_column would significantly improve performance, but my test results show almost no gain. Are there any other hints with which I can improve performance, apart from adding new indexes? Is there any harm in adding indexes like these in the long run?

OR

Is it better to create another table and keep the last 3 days of data in it, populated more or less live (using a DB trigger)? I can easily have another process purge data older than 3 days every night.

Thanks in advance.

Oracle partitioning would make sense here, but it is an extra-cost option even for Enterprise Edition. If partitioning is not available, a separate table holding the last 3 days should give the best performance. You should try it.
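If the partitioning option is licensed, a daily interval-partitioned layout could look roughly like the sketch below. The table name `A_part`, the column data types, and the initial partition boundary are assumptions made for illustration; only the column names come from the question.

```sql
-- Sketch only: A_part, the data types, and the boundary date are hypothetical.
CREATE TABLE A_part (
  id          NUMBER        PRIMARY KEY,
  date_column DATE          NOT NULL,
  x           VARCHAR2(30),
  y           VARCHAR2(30),
  z           VARCHAR2(30)
)
PARTITION BY RANGE (date_column)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))   -- Oracle creates one partition per day as rows arrive
(
  PARTITION p_initial VALUES LESS THAN (DATE '2020-01-01')
);

-- With a predicate on the partition key, the optimizer can prune
-- down to the few daily partitions that cover the requested range.
SELECT *
FROM   A_part
WHERE  date_column >= TRUNC(SYSDATE) - 3;
```

With this layout the 3-day query only has to read the most recent daily partitions rather than the whole table.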

If you want to get the most out of indexes, you can consider tuning the physical parameters a bit:

  • if the date column is not updated and the data is rarely deleted, you can set PCTFREE 0
  • looking at your final query, I would suggest creating an index on the trunc(date) column and using compression, so that each index data block stores many more entries. In that case the final query condition should be trunc(date_column) >= trunc(sysdate) - 3

Depending on the selectivity of X,Y,Z in table A, it could make sense to compress them as well. So I suggest checking two cases:

  1. create index trunc_date_ai on A(trunc(date_column)) pctfree 0 compress; + your IDX2
  2. create index trunc_date_ai on A(trunc(date_column),X,Y,Z) pctfree 0 compress; Use pctfree 0 only if X,Y,Z are not updated in table A. The compress keyword here compresses all 4 columns, so it is worth using if the X,Y,Z values repeat heavily in table A for a given trunc(date_column).

To force use of the index, you can hint the query, for example:

select --+ index (A trunc_date_ai)
       * 
from   A left outer join B 
on A.X = B.X and A.Y = B.Y and A.Z = B.Z and B.M = 'XYZ' and B.N = 'UIM'
where trunc(A.date_column) >= trunc(sysdate) - 3

You should check the execution plan to see if the indexes are being used. I am guessing that the index on date_column is not used and that the difference between 32 and 34 seconds is just noise.
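One way to look at the plan, as a minimal sketch, is `EXPLAIN PLAN` followed by `DBMS_XPLAN.DISPLAY`. The query is the one from the question; nothing else is assumed beyond the default plan table being available in your schema.

```sql
-- Record the optimizer's plan for the query without running it.
EXPLAIN PLAN FOR
SELECT *
FROM   A
LEFT OUTER JOIN B
  ON  A.X = B.X AND A.Y = B.Y AND A.Z = B.Z
  AND B.M = 'XYZ' AND B.N = 'UIM'
WHERE  A.date_column >= TRUNC(SYSDATE) - 3;

-- Show the plan that was just recorded.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the output shows a full table scan on A rather than an index range scan, the index on date_column is not being used for this query.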

I would suggest an index on A(date_column, X, Y, Z) for this query.
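As a sketch, that index could be created as follows. The index name `a_date_xyz_idx` is a placeholder, not something from the question.

```sql
-- Hypothetical index name; the leading column serves the date range
-- predicate, and X, Y, Z are then available from the index for the join to B.
CREATE INDEX a_date_xyz_idx ON A (date_column, X, Y, Z);
```

Putting date_column first matters here: the range condition on the date can only drive an index range scan if that column leads the index.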

Is there harm in adding indexes? Well, they add overhead on inserts, updates, and deletes. If your inserts are transactional, then you are inserting about 10 rows per second, not counting updates and deletes. If your peaks are significantly higher than that and your hardware is not very good, then the indexes could slow things down. If the additional rows are added in batches, I wouldn't worry about the overhead.
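The "about 10 rows per second" figure is just the daily volume from the question spread evenly over 24 hours, which can be checked with a one-line query. This assumes a uniform arrival rate; real peaks will be higher.

```sql
-- 750,000 rows per day divided by the number of seconds in a day.
SELECT ROUND(750000 / (24 * 60 * 60), 2) AS avg_rows_per_second
FROM   dual;
-- 8.68
```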

I doubt that splitting the table out into a separate 3-day table would make much of a difference. But why listen to me? Try it out. Take the last 3+ days of data, dump it into a table, index it properly, and see if the queries are faster.
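A rough way to run that experiment is sketched below. The names `A_recent` and `a_recent_idx` are placeholders, and the extra day of data (`- 4`) is an arbitrary margin for the "3+ days".

```sql
-- Hypothetical names; copy slightly more than 3 days into a scratch table.
CREATE TABLE A_recent AS
SELECT *
FROM   A
WHERE  date_column >= TRUNC(SYSDATE) - 4;

-- Index the copy the same way you would index A.
CREATE INDEX a_recent_idx ON A_recent (date_column, X, Y, Z);

-- Run the original query against the copy and compare timings.
SELECT *
FROM   A_recent R
LEFT OUTER JOIN B
  ON  R.X = B.X AND R.Y = B.Y AND R.Z = B.Z
  AND B.M = 'XYZ' AND B.N = 'UIM'
WHERE  R.date_column >= TRUNC(SYSDATE) - 3;
```

If the timing against `A_recent` is close to the timing against a properly indexed A, the extra table and its maintenance are probably not worth it.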


 