
Oracle: improve query performance

I'm new to Oracle and I'm struggling with this problem.

I have a table with approximately 520 million rows. I have to fetch all the rows and import them (denormalizing) into a NoSQL database.

The table has two integer fields, C_ID and A_ID, and three indexes: one on C_ID, one on A_ID, and one on both fields.

At first I tried this:

SELECT C_ID, A_ID FROM M_TABLE;

This never gave me any result in a reasonable time (I could not measure how long it takes because it never seemed to complete).

I then changed the query to this:

SELECT /*+ ALL_ROWS */ C_ID, A_ID FROM (SELECT
    rownum rn, C_ID, A_ID
FROM
    M_TABLE WHERE rownum < ((:1 * :2 ) +1 )) WHERE rn >= (((:1 -1) * :2 ) +1 );

I run this query in parallel using 3 threads, paginating with a page size of 1000.

I also tried three optimizations:

1) I gathered statistics on the table:

ANALYZE TABLE M_TABLE ESTIMATE STATISTICS SAMPLE 5 PERCENT;

2) I split the table into 8 partitions.

3) I created the table with the parallel option.

Now I am able to fetch 10000 rows per second, so the whole process takes about 15 hours to complete (the DB runs on a 4-core, 8 GB machine).

The problem is that I need to complete everything in at most 5 hours.

I am out of ideas, so before I ask for a new machine: do you know any way to improve performance in such a situation?

Oracle is pretty good at telling us where it has spent its time. You can find out by tracing your session with Oracle's extended SQL trace (in other words, the 10046 trace). Your query extracts data from a single table that holds a lot of data, so check your I/O rate: the "db file scattered read" wait event is probably one of the top wait events for your query.
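As a minimal sketch of what that tracing could look like, assuming you run the query from a session with the ALTER SESSION privilege (the tracefile identifier below is an arbitrary name I picked, not something from the question):

```sql
-- Tag the trace file so it is easy to find on the server
-- ('m_table_scan' is a hypothetical label).
ALTER SESSION SET tracefile_identifier = 'm_table_scan';

-- Event 10046 at level 8 records wait events in addition to the basic SQL trace.
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- The statement you want to analyse.
SELECT C_ID, A_ID FROM M_TABLE;

-- Switch tracing off again.
ALTER SESSION SET EVENTS '10046 trace name context off';
```

The resulting trace file is written on the database server and is usually summarised with TKPROF before reading it.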

Hope it helps.

What do you do with the result? Is it written directly to a file with PL/SQL, or do you use another application to process the data? Is it sent across the network? (This might be the low-hanging fruit.)

The reason I ask is that usually a FULL SCAN (without an ORDER BY) will return the first rows instantly. If you're writing the result to a file you should see it start to fill up immediately. If you do not, this suggests that there is a lot of empty space at the beginning of the segment, which could explain why the query never returns (in a reasonable time at least).

So when you say that your query doesn't return, I'm a bit concerned: how can you tell? Does the following block return?

DECLARE
  l NUMBER := 0;
BEGIN
  FOR cc IN (SELECT C_ID, A_ID FROM M_TABLE) LOOP
    l := l + 1;
    EXIT WHEN l >= 100000;
  END LOOP;
END;

If it does, it means that your FULL SCAN is being processed. By timing the above block you should be able to calculate how much time a complete single scan would need, assuming the segment is uniformly dense.
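As a rough sketch of that extrapolation: assuming the 100000-row block took 2 seconds (a made-up figure, substitute your own measurement) and that the segment is uniformly dense, the arithmetic can be done directly against DUAL. The 520 million row count and the 5-hour limit come from the question.

```sql
-- Hypothetical measurement: the 100000-row test block took 2 seconds.
-- Replace the literal 2 with your own timing (for example from SET TIMING ON in SQL*Plus).
SELECT 2 * (520000000 / 100000) / 3600 AS est_full_scan_hours,
       -- sustained fetch rate needed to read 520 million rows in 5 hours
       520000000 / (5 * 3600)          AS rows_per_sec_needed_for_5h
FROM   dual;
```

The second column is the rate to compare against the 10000 rows per second you currently observe.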

Reading 500M rows is a lot of work, but the rows are tiny, so if the table segment is well compacted Oracle should return all rows in a reasonable time. Table segments can end up with an inefficient space layout if, for example, they are repeatedly deleted from and then loaded with INSERT /*+APPEND*/. Rebuilding the table (ALTER TABLE MOVE) will remove all the useless empty space in the segment. By the way, when you partitioned the table you did rebuild it, so this may be the reason why your query now returns!
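One way to get a feel for how much empty space the segment carries is to compare the space allocated to the table with the size of the row data according to the statistics. This is only a sketch: it assumes the table lives in your own schema, and NUM_ROWS and AVG_ROW_LEN are only as accurate as the last statistics gathering.

```sql
-- Space actually allocated to the table (summed over partitions, if any).
SELECT ROUND(SUM(bytes) / 1024 / 1024) AS allocated_mb
FROM   user_segments
WHERE  segment_name = 'M_TABLE'
AND    segment_type LIKE 'TABLE%';

-- Approximate size of the row data according to the optimizer statistics.
SELECT num_rows,
       avg_row_len,
       ROUND(num_rows * avg_row_len / 1024 / 1024) AS approx_data_mb
FROM   user_tables
WHERE  table_name = 'M_TABLE';
```

If the allocated size is many times larger than the approximate data size, a rebuild is probably worth trying.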

In any case I would advise you to retry the FULL TABLE SCAN, possibly after having rebuilt the table to reset any empty space and the high water mark. A single FULL TABLE SCAN is by far the most reliable method (and one of the most efficient) to access lots of data.
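The comparison with the ROWNUM pagination from the question can be made concrete with some arithmetic. As I read that query, the inner rownum filter makes Oracle produce rows from the start of the scan up to the page's upper bound on every execution, so page n costs roughly n * 1000 rows. Also, without an ORDER BY, the pages are not guaranteed to be cut from the same ordering each time. A sketch of the totals, assuming 520 million rows and a page size of 1000:

```sql
-- Assumed figures: 520 million rows, page size 1000, hence 520000 pages.
SELECT 520000000 / 1000                   AS pages,
       -- rows produced by the inner query, summed over all pages:
       -- page_size * n * (n + 1) / 2 with n = 520000
       1000 * (520000 * (520000 + 1) / 2) AS rows_produced_by_pagination,
       -- rows produced by one full scan
       520000000                          AS rows_produced_by_single_scan
FROM   dual;
```

Under those assumptions the paginated approach touches several orders of magnitude more rows than one pass over the table.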

If you need to further improve performance, I suggest you take a look at ROWID partitioning (a DIY parallel processing scheme) or the built-in package DBMS_PARALLEL_EXECUTE.
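A minimal sketch of the ROWID approach, using DBMS_PARALLEL_EXECUTE only to compute the ROWID ranges and leaving the actual reading to your own threads. It assumes Oracle 11g Release 2 or later and a table in your own schema; the task name and the chunk size are arbitrary choices of mine.

```sql
BEGIN
  -- 'EXPORT_M_TABLE' is a hypothetical task name.
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'EXPORT_M_TABLE');

  -- Split the table into ROWID ranges of roughly one million rows each.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'EXPORT_M_TABLE',
    table_owner => USER,
    table_name  => 'M_TABLE',
    by_row      => TRUE,
    chunk_size  => 1000000);
END;
/

-- List the ranges and hand them out to the worker threads.
SELECT chunk_id, start_rowid, end_rowid
FROM   user_parallel_execute_chunks
WHERE  task_name = 'EXPORT_M_TABLE'
ORDER  BY chunk_id;

-- Each worker reads only its own slice of the table.
SELECT C_ID, A_ID
FROM   M_TABLE
WHERE  rowid BETWEEN :start_rowid AND :end_rowid;
```

Each range is read once and the ranges do not overlap, so no worker rescans rows that another worker has already fetched. DBMS_PARALLEL_EXECUTE.DROP_TASK('EXPORT_M_TABLE') removes the task afterwards.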

It might be a bit of a drastic solution to try, but you could look at table compression. In Oracle 10g this is only really useful for read-only tables, since the blocks are uncompressed when write operations are done. I've found compression to be useful for large tables in a data warehousing environment.

It is also possible to compress only certain partitions, so if you are adding data to the end of a table that is partitioned by date, you could compress the historical partitions while leaving the most recent one uncompressed.
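Compressing a single partition could look roughly like this. P_OLD is a placeholder partition name, so list the real partition names first:

```sql
-- Show the partitions and whether compression is enabled for each.
SELECT partition_name, compression
FROM   user_tab_partitions
WHERE  table_name = 'M_TABLE';

-- Rebuild one historical partition as compressed.
-- P_OLD is a hypothetical partition name.
ALTER TABLE M_TABLE MOVE PARTITION P_OLD COMPRESS;
```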

The advantage of table compression is that it reduces the amount of I/O required, which could help on an I/O-constrained system. I was often getting 10:1 compression out of tables, although it depends on what is stored in the table and on how the data was sorted when it was inserted.

For an existing table I think you can use the command:

ALTER TABLE M_TABLE MOVE COMPRESS;

Note that this may help to solve your problem, but changing the underlying structure of the tables might be a little drastic. Also, rebuilding the table as compressed can invalidate some of the indexes.
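After a move it is worth checking the index status. A sketch, assuming the table and its indexes are in your own schema (the index name in the commented-out rebuild is hypothetical):

```sql
-- Non-partitioned indexes show VALID or UNUSABLE here;
-- partitioned indexes show N/A and are checked per partition below.
SELECT index_name, status
FROM   user_indexes
WHERE  table_name = 'M_TABLE';

-- Status of each partition of the partitioned indexes.
SELECT index_name, partition_name, status
FROM   user_ind_partitions
WHERE  index_name IN (SELECT index_name
                      FROM   user_indexes
                      WHERE  table_name = 'M_TABLE');

-- Rebuild an unusable non-partitioned index (hypothetical name):
-- ALTER INDEX M_TABLE_C_ID_IDX REBUILD;
```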

Under Oracle 11g you can also use advanced compression, which allows updates to the data, but this involves expensive licensing costs.

There is some documentation here and a lot more information in this PDF document.

Yes, as user2033072 said, you should use SQL Trace and TKPROF to learn a bit more about the query. See the official documentation.

Also, more simply, you could use EXPLAIN PLAN; that way Oracle will show what it is planning to do.
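For the simple query from the question that would look something like this (a sketch, run from SQL*Plus or a similar client):

```sql
-- Ask the optimizer for its plan without executing the query.
EXPLAIN PLAN FOR
  SELECT C_ID, A_ID FROM M_TABLE;

-- Display the plan that was just stored in the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the output, look at the access path chosen for M_TABLE, for example TABLE ACCESS FULL or an index scan.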
