
Oracle indexes. "DISTINCT_KEYS" vs "NUM_ROWS". Do I need a NONUNIQUE index?

I have a table with a lot of indexes. I noticed that for one of them "DISTINCT_KEYS" is almost the same as "NUM_ROWS". Is such an index needed?

Or maybe it is better to remove it because:

  1. It takes up space in the database.
  2. When data is added to the table, maintaining the index slows the insert down unnecessarily.

What do you think? Will dropping this index slow down the queries that filter on this column?


Is such an index needed?

All you can tell from statistics like DISTINCT_KEYS and NUM_ROWS (and other statistics like histograms) is whether an index might be useful. An index is only truly "needed" if it is actually being used by queries in your system. (See the ALTER INDEX ... MONITORING USAGE command.)
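As a minimal sketch of that usage check: the index name idx_table_y_column_z is hypothetical, and the sketch assumes Oracle 12c or later, where the results appear in USER_OBJECT_USAGE (older releases expose them through V$OBJECT_USAGE instead).

```sql
-- Start recording whether the optimizer uses the index.
-- idx_table_y_column_z is a hypothetical name; substitute your own index.
ALTER INDEX idx_table_y_column_z MONITORING USAGE;

-- ... let a representative workload run before checking ...

-- USED = 'YES' means the index appeared in at least one execution plan
-- since monitoring was switched on.
SELECT index_name,
       table_name,
       monitoring,
       used,
       start_monitoring,
       end_monitoring
FROM   user_object_usage
WHERE  index_name = 'IDX_TABLE_Y_COLUMN_Z';

-- Stop recording once you have your answer.
ALTER INDEX idx_table_y_column_z NOMONITORING USAGE;
```

A USED value of 'NO' is only as trustworthy as the monitoring window: a monthly or year-end report that did not run during that window would not be counted.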

An index whose DISTINCT_KEYS is almost equal to NUM_ROWS certainly might be useful. In fact, it would be much more natural to suspect an index of being useless if DISTINCT_KEYS were a very low percentage of NUM_ROWS.

Suppose you have a query:

SELECT column_x
FROM   table_y
WHERE  column_z = :some_value

Suppose the index on column_z shows DISTINCT_KEYS = 999999 and NUM_ROWS = 1000000.

That means, on average, each distinct key has (very) slightly more than one row. That makes the index very selective and very useful. When our query runs, we will use the index to pull out only one row of the table very quickly.
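The rows-per-key average described above can be read straight from the data dictionary. This is a rough sketch that assumes the table is named TABLE_Y, as in the example query, and that its statistics are reasonably fresh (LAST_ANALYZED shows when they were gathered).

```sql
-- Average number of rows per distinct key for every index on the table.
-- Values close to 1 indicate a highly selective index.
SELECT index_name,
       uniqueness,
       distinct_keys,
       num_rows,
       -- NULLIF avoids a divide-by-zero on empty or unanalyzed indexes
       ROUND(num_rows / NULLIF(distinct_keys, 0), 2) AS avg_rows_per_key,
       last_analyzed
FROM   user_indexes
WHERE  table_name = 'TABLE_Y'
ORDER  BY avg_rows_per_key;
```

Indexes at the bottom of this list, with a large number of rows per key, are the ones worth a closer look. The index in the question would sit at the top.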

Suppose, instead, the index on column_z shows DISTINCT_KEYS = 2 and NUM_ROWS = 1000000. Now, each distinct key has an average of 500,000 rows. This index is worthless because we have to read half of the blocks from the index and then still probably wind up reading at least half of the blocks from the table (probably way more than half). Worse, these reads are all single-block reads. It would be way, way faster for Oracle to ignore the index and do a full table scan: fewer blocks in total to read, and all the reads are multi-block reads (e.g., 8 at a time).
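One way to see this trade-off on your own data is to ask the optimizer to cost both access paths. This is only a sketch: the alias t and the index name idx_table_y_column_z are made up for illustration, and EXPLAIN PLAN does not peek at bind values, so the estimates reflect the average case rather than any specific value.

```sql
-- Plan and estimated cost when a full table scan is forced.
EXPLAIN PLAN FOR
SELECT /*+ FULL(t) */ column_x
FROM   table_y t
WHERE  column_z = :some_value;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan and estimated cost when the (hypothetically named) index is forced.
EXPLAIN PLAN FOR
SELECT /*+ INDEX(t idx_table_y_column_z) */ column_x
FROM   table_y t
WHERE  column_z = :some_value;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the Cost and Rows columns of the two plans shows which path the optimizer considers cheaper. Without hints, it should pick that path on its own.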

For completeness, I'll point out that an index with DISTINCT_KEYS = 2 and NUM_ROWS = 1000000 could still be useful if the data is very skewed: for example, if one distinct key had 999,000 rows and the other had only 1,000 rows. The index would be useful for finding the rows of that other (smaller) distinct key. Oracle gathers histograms as part of its statistics to keep track of which columns have skewed data and, if so, how many rows there are for each distinct key. (This is an oversimplification.)
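To check whether Oracle already holds a histogram on the column, and to request one if it does not, something like the following could be used. It is a sketch under assumptions: the table and column names come from the example query, the table is assumed to live in the current schema, and the bucket count of 254 is simply a commonly used value.

```sql
-- HISTOGRAM = 'NONE' means the optimizer assumes an even distribution.
SELECT column_name,
       num_distinct,
       histogram,
       num_buckets
FROM   user_tab_col_statistics
WHERE  table_name  = 'TABLE_Y'
AND    column_name = 'COLUMN_Z';

-- Re-gather statistics, explicitly asking for a histogram on COLUMN_Z.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TABLE_Y',
    method_opt => 'FOR COLUMNS COLUMN_Z SIZE 254',
    cascade    => TRUE   -- also refresh the statistics of the table's indexes
  );
END;
/
```

With a histogram in place, the optimizer can estimate the row count for a specific value rather than the average, which is what lets it use the index for the rare key and a full scan for the common one.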

TL;DR It's very likely a good index and no more likely to be "unneeded" than any other index in your system.


 