Concurrent statistics gathering on Oracle 11g partitioned table

I am developing a DWH on Oracle 11g. We have some big tables (250+ million rows), partitioned by value. Each partition is assigned to a different feeding source, and every partition is independent of the others, so they can be loaded and processed concurrently.

Data distribution is very uneven: we have partitions with millions of rows and partitions with no more than a hundred rows. I didn't choose the partitioning scheme, and I can't change it.

Given the data volume, we must ensure that every partition always has up-to-date statistics, because if the subsequent processing steps don't have optimal access to the data, they will take forever.

So for each concurrent ETL thread, we

  1. Truncate the partition
  2. Load data from the staging area with

INSERT /*+ APPEND */ INTO big_table PARTITION (part1)
SELECT * FROM temp_table WHERE partition_colum = 'PART1';

(this way we get a direct-path insert and we don't lock the whole table)

  3. Gather statistics for the modified partition.
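The truncate and load steps above might be sketched as a single PL/SQL block. This is only an illustration: the partition name and the literal 'PART1' come from the question's example, and the statistics call is left out because it is discussed separately.

```sql
BEGIN
  -- 1. Empty the target partition (DDL, so it goes through EXECUTE IMMEDIATE)
  EXECUTE IMMEDIATE 'ALTER TABLE big_table TRUNCATE PARTITION part1';

  -- 2. Direct-path load from the staging table into that partition only
  INSERT /*+ APPEND */ INTO big_table PARTITION (part1)
  SELECT * FROM temp_table WHERE partition_colum = 'PART1';

  -- A direct-path insert must be committed before this session
  -- can read the table again
  COMMIT;
END;
/
```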

In the first stage of the project, we used the APPROX_GLOBAL AND PARTITION strategy, and it worked like a charm:

 -- myschema is a variable holding the owner name
 dbms_stats.gather_table_stats(ownname=>myschema,
                               tabname=>'BIG_TABLE',
                               partname=>'PART1',
                               estimate_percent=>1,
                               granularity=>'APPROX_GLOBAL AND PARTITION',
                               cascade=>dbms_stats.auto_cascade,
                               degree=>dbms_stats.auto_degree);

But this had a drawback: when we loaded a small partition, the APPROX_GLOBAL part was dominant (still a lot faster than GLOBAL), so a small partition took, for example, 10 seconds to load and 20 minutes to gather statistics.

So it was suggested that we switch to the INCREMENTAL STATS feature of 11g, which means that you don't specify the partition you have modified, you leave all parameters on auto, and Oracle does its magic, automatically working out which partition(s) have been touched. And it actually works: we have really sped up the small partitions. After turning on the feature, the call became

 dbms_stats.gather_table_stats(ownname=>myschema,
                               tabname=>'BIG_TABLE',
                               estimate_percent=>dbms_stats.auto_sample_size,
                               granularity=>'AUTO',
                               cascade=>dbms_stats.auto_cascade,
                               degree=>dbms_stats.auto_degree);

Notice that you no longer pass the partition, and you don't specify a sample percentage.

But now we have another drawback, maybe even worse than the previous one, and it is related to our high level of parallelism.

Let's say the loads of 2 big partitions start at the same time; they will also finish the load phase at almost the same time.

  1. The first thread finishes its insert statement, commits, and launches the stats gathering. The stats procedure notices that 2 partitions have been modified (this is correct: one is full and the second is truncated, with a transaction in progress) and updates the stats for both partitions.

  2. Eventually the second thread finishes its load and gathers the stats; the procedure sees all partitions as already up to date and does nothing (this is NOT correct, because the second thread has committed its data in the meantime).

The result is:

PARTITION NAME | LAST ANALYZED        | NUM ROWS | BLOCKS | SAMPLE SIZE
-----------------------------------------------------------------------
PART1          | 04-MAR-2015 15:40:42 | 805731   | 20314  | 805731
PART2          | 04-MAR-2015 15:41:48 | 0        | 16234  | (null)
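A listing like the one above can be produced from the USER_TAB_STATISTICS dictionary view. A possible query, assuming the table is named BIG_TABLE and belongs to the current user:

```sql
-- Per-partition statistics, to spot partitions whose stats
-- do not match the data that was actually loaded
SELECT partition_name,
       last_analyzed,
       num_rows,
       blocks,
       sample_size,
       stale_stats
FROM   user_tab_statistics
WHERE  table_name  = 'BIG_TABLE'
AND    object_type = 'PARTITION'
ORDER  BY partition_name;
```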

and the consequence is that I occasionally get suboptimal plans (which means killing the session, manually refreshing the stats, and manually launching the process again).

I even tried putting an exclusive lock around the gathering, so that no more than one thread can gather stats on the same table at once, but nothing changed.

IMHO this is odd behaviour, because the second time the stats procedure is invoked, it should check the last commit on the second partition and see that it is newer than the last stats gathering time. But that doesn't seem to happen.

Am I doing something wrong? Is it an Oracle bug? How can I guarantee that all stats are always up to date with the incremental stats feature turned on and a high level of concurrency?

I managed to reach a decent compromise with this procedure.

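-- Gathers global and partition statistics when the table is flagged as stale,
-- otherwise gathers statistics for the loaded partition only.
PROCEDURE gather_tb_partiz(
    p_tblname  IN VARCHAR2,
    p_partname IN VARCHAR2)
IS
  v_stale all_tab_statistics.stale_stats%TYPE;
BEGIN
  BEGIN
    -- Table-level staleness flag
    SELECT stale_stats
    INTO v_stale
    FROM user_tab_statistics
    WHERE table_name = p_tblname
    AND object_type = 'TABLE';
  EXCEPTION
  WHEN NO_DATA_FOUND THEN
    -- No statistics row for the table: treat it as stale
    v_stale := 'YES';
  END;
  IF v_stale = 'YES' THEN
    -- Table is stale: refresh both global and partition statistics
    dbms_stats.gather_table_stats(ownname=>myschema,
                                  tabname=>p_tblname,
                                  partname=>p_partname,
                                  degree=>dbms_stats.auto_degree,
                                  granularity=>'APPROX_GLOBAL AND PARTITION');
  ELSE
    -- Table is not stale: refresh the partition statistics only
    dbms_stats.gather_table_stats(ownname=>myschema,
                                  tabname=>p_tblname,
                                  partname=>p_partname,
                                  degree=>dbms_stats.auto_degree,
                                  granularity=>'PARTITION');
  END IF;
END gather_tb_partiz;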

At the end of each ETL, if the number of added/deleted/modified rows is low enough not to mark the table as stale (10% by default, tunable with the STALE_PERCENT preference), I collect only partition statistics; otherwise I collect global and partition statistics.
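As a rough sketch of how that threshold could be inspected and tuned for a single table, assuming the table is BIG_TABLE in the current schema (the value 5 is an arbitrary example, not a recommendation):

```sql
-- Read the staleness threshold currently in effect for the table
SELECT dbms_stats.get_prefs('STALE_PERCENT', USER, 'BIG_TABLE') AS stale_percent
FROM   dual;

-- Lower the threshold to 5% for this table only
BEGIN
  dbms_stats.set_table_prefs(USER, 'BIG_TABLE', 'STALE_PERCENT', '5');
END;
/
```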

This keeps the ETL of small partitions fast, because the global statistics don't have to be regathered, and keeps big partitions safe, because any subsequent query will have fresh statistics and will likely use an optimal plan.

Incremental stats is enabled anyway, so whenever the global statistics have to be recalculated, it is pretty fast, because it aggregates partition-level statistics and does not perform a full scan.

I am not sure whether, with incremental enabled, "APPROX_GLOBAL AND PARTITION" and "GLOBAL AND PARTITION" differ in any way, because incremental and approx do basically the same thing: aggregate stats and histograms without doing a full scan.

Have you tried keeping incremental statistics on while still explicitly naming the partition to analyze?

 dbms_stats.gather_table_stats(ownname=>myschema,
                               tabname=>'BIG_TABLE',
                               partname=>'PART1',
                               degree=>dbms_stats.auto_degree);

For your table, stale (yesterday's) global stats are not as harmful as completely invalid partition stats (0 rows). I can propose 2 slightly different approaches that we use:

  • Have a separate GLOBAL stats gathering executed by your ETL tool right after all partitions are loaded. If it takes too long, play with estimate_percent, as dbms_stats.auto_sample_size will likely sample more than 1%.
  • Gather the global (as well as all other stale) stats in a separate database job run later in the day, after all data is loaded into the DW.

The key point is that stale statistics which differ only slightly from fresh ones are almost as good. But if statistics show 0 rows, they'll kill any query.
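The two approaches might look roughly like the following. This is a sketch under assumptions: the table is BIG_TABLE in the current schema, and the 1 percent sample is only an example value.

```sql
-- Approach 1: run once by the ETL tool after the last partition load commits
BEGIN
  dbms_stats.gather_table_stats(
    ownname          => USER,
    tabname          => 'BIG_TABLE',
    granularity      => 'GLOBAL',
    estimate_percent => 1,   -- example value; tune if the run takes too long
    degree           => dbms_stats.auto_degree);
END;
/

-- Approach 2: a later job that regathers only objects flagged as stale
BEGIN
  dbms_stats.gather_schema_stats(
    ownname => USER,
    options => 'GATHER STALE');
END;
/
```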

Considering what you are trying to achieve, you need to run stats at specific intervals for all partitions, not at the end of the process that loads each partition. That could be challenging if this were a live table with data loads happening around the clock, but since these are LARGE DW tables I really doubt that's the case. So the best bet is to collect stats once all partitions have been loaded. This ensures that statistics are collected for the partitions where data has changed or statistics are missing, and that the global statistics are updated from the partition-level statistics and synopses.

However, to do so you need to turn on the incremental feature for the table (available from 11gR1).

EXEC DBMS_STATS.SET_TABLE_PREFS('<Owner>','BIG_TABLE','INCREMENTAL','TRUE');

At the end of every load, gather table statistics using GATHER_TABLE_STATS. You don't need to specify the partition name. Also, do not specify the granularity parameter.

EXEC DBMS_STATS.GATHER_TABLE_STATS('<Owner>','BIG_TABLE');

Kindly check whether you have used DBMS_STATS to set the table preference for gathering incremental statistics. This Oracle blog explains that statistics are regathered for any partition in which even a single row has changed:

Incremental statistics maintenance needs to gather statistics on any partition that will change the global or table level statistics. For instance, the min or max value for a column could change after just one row is inserted or updated in the table.

BEGIN 
DBMS_STATS.SET_TABLE_PREFS(myschema,'BIG_TABLE','INCREMENTAL','TRUE'); 
END;
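To check what is currently set, the preferences can be read back with DBMS_STATS.GET_PREFS. The query below is a sketch that assumes BIG_TABLE lives in the current schema. As far as I recall, 11g incremental maintenance also expects PUBLISH to be TRUE and the gather to use AUTO_SAMPLE_SIZE and AUTO granularity, so those preferences are shown as well.

```sql
-- Preferences relevant to incremental statistics for one table
SELECT dbms_stats.get_prefs('INCREMENTAL',      USER, 'BIG_TABLE') AS incremental,
       dbms_stats.get_prefs('PUBLISH',          USER, 'BIG_TABLE') AS publish,
       dbms_stats.get_prefs('GRANULARITY',      USER, 'BIG_TABLE') AS granularity,
       dbms_stats.get_prefs('ESTIMATE_PERCENT', USER, 'BIG_TABLE') AS estimate_percent
FROM   dual;
```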

I'm a bit rusty on this, so first of all a question: did you try serializing the partition loads? If so, how long does statistics gathering take and how well does it work? Since loading time is so much shorter than statistics gathering, I guess this could also act as a temporary workaround.

The APPEND hint does affect redo size, meaning the transaction logs only minimal information, so statistics gathering may not take the new data into account: http://oracle-base.com/articles/misc/append-hint.php

Thinking out loud: since a direct-path insert appends rows at the end of the partition and only updates the metadata at the end, a statistics-gathering thread that is already running could have read non-updated (stale) data. So it may not be a bug, and locking threads would accomplish nothing.

You could test this behaviour by temporarily switching your table/partition to LOGGING, for instance, and seeing how it works (slower, of course, but it's a test). Can you do it?
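Such a test could be set up along these lines. The statements are a sketch that assumes the partition is PART1 of BIG_TABLE and that it is currently NOLOGGING; check the actual setting first.

```sql
-- Check the current logging attribute of each partition
SELECT partition_name, logging
FROM   user_tab_partitions
WHERE  table_name = 'BIG_TABLE';

-- Switch one partition to LOGGING for the duration of the test
ALTER TABLE big_table MODIFY PARTITION part1 LOGGING;

-- ... rerun the concurrent load and the statistics gathering here ...

-- Restore the previous setting afterwards (assuming it was NOLOGGING)
ALTER TABLE big_table MODIFY PARTITION part1 NOLOGGING;
```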

EDIT: incremental stats should work anyway, even with parallel statistics gathering disabled, since it relies on the incremental values no matter how they were collected: https://blogs.oracle.com/optimizer/entry/incremental_statistics_maintenance_what_statistics
