Update a column value for 500 million rows in Interval Partitioned table

We have a table with 10 billion rows, interval-partitioned on a date column. In one subpartition we need to update that date to a new value for the 500 million rows that match a given criterion. Because the table is partitioned on this same date, the update will definitely have side effects, such as new partitions being created. Could anyone give me pointers on the best approach to follow?

Thanks in advance!

If you are going to update the partitioning key and the source rows are in a single (sub)partition, then a reasonable approach would be to:

  1. Create a temporary table for the updated rows. If possible, perform the update on the fly:

     CREATE TABLE updated_rows AS SELECT add_months(partition_key, 1) AS partition_key, other_columns... FROM original_table PARTITION (xxx) WHERE ...;
  2. Drop the original (sub)partition:

     ALTER TABLE original_table DROP PARTITION xxx;
  3. Reinsert the updated rows:

     INSERT /*+ APPEND */ INTO original_table SELECT * FROM updated_rows;
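
One thing to watch with the steps above: DROP PARTITION removes every row in the partition, not only the rows copied out by the WHERE clause. If only some rows in the (sub)partition match the criteria, one possible variant is to copy all rows and shift the key conditionally. This is a minimal sketch, assuming hypothetical columns id, status and payload, a partition named p_xxx, and status = 'STALE' standing in for the real criteria:

```sql
-- Hypothetical names: columns id, status, payload; partition p_xxx;
-- the predicate status = 'STALE' stands in for the real criteria.

-- Copy every row of the partition, shifting the key only where the criteria match.
CREATE TABLE updated_rows AS
SELECT id,
       CASE WHEN status = 'STALE'
            THEN ADD_MONTHS(partition_key, 1)
            ELSE partition_key
       END AS partition_key,
       status,
       payload
FROM   original_table PARTITION (p_xxx);

-- Confirm the copy is complete before anything is dropped.
SELECT (SELECT COUNT(*) FROM original_table PARTITION (p_xxx)) AS source_rows,
       (SELECT COUNT(*) FROM updated_rows)                     AS copied_rows
FROM   dual;
```

If the two counts agree, dropping the partition and reinserting from updated_rows loses nothing.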

In case you have issues with CTAS or INSERT INTO SELECT for 500M rows, consider partitioning the temporary table and moving the data in batches.
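
The batching idea could look roughly like this. It is only a sketch: the columns id, status and payload, the partition p_xxx, the predicate, and the batch partition names b1 to b4 are all assumptions made for illustration.

```sql
-- Hypothetical schema: id, partition_key, status, payload; p_xxx is the source partition.
-- Stage the updated rows in a hash-partitioned table with named partitions,
-- so that each staging partition can be moved as one batch.
CREATE TABLE updated_rows
PARTITION BY HASH (id)
  (PARTITION b1, PARTITION b2, PARTITION b3, PARTITION b4)
AS
SELECT id,
       ADD_MONTHS(partition_key, 1) AS partition_key,
       status,
       payload
FROM   original_table PARTITION (p_xxx)
WHERE  status = 'STALE';

-- After the source partition has been dropped, move one batch at a time.
INSERT /*+ APPEND */ INTO original_table (id, partition_key, status, payload)
SELECT id, partition_key, status, payload
FROM   updated_rows PARTITION (b1);
COMMIT;  -- a direct-path insert must be committed before the session reads the table again

INSERT /*+ APPEND */ INTO original_table (id, partition_key, status, payload)
SELECT id, partition_key, status, payload
FROM   updated_rows PARTITION (b2);
COMMIT;
-- ...repeat for b3 and b4.
```

Smaller batches keep each transaction's undo and temp usage bounded, and a failed batch can be rerun without redoing the others.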

Hmm... If you have enough space, I would create a copy of the source table containing the correctly updated rows, check the results, then drop the source table and rename the copy to the original name. Yes, this takes a long time to execute, but it can be a painless way to do it. A parallel hint is needed, of course.
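
A rough sketch of the copy-and-rename approach follows. The column names, the monthly interval, the starting partition bound, the degree of parallelism and the status = 'STALE' predicate are all assumptions, and subpartitioning is left out for brevity. The sketch also renames the source out of the way instead of dropping it immediately, which is a slightly more cautious variant.

```sql
-- Hypothetical schema: id, partition_key (DATE), status, payload.
-- Build the copy with the corrected dates in a single parallel CTAS.
CREATE TABLE original_table_copy
PARTITION BY RANGE (partition_key)
  INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
  (PARTITION p_first VALUES LESS THAN (DATE '2000-01-01'))
PARALLEL 8
AS
SELECT /*+ PARALLEL(t, 8) */
       id,
       CASE WHEN status = 'STALE'
            THEN ADD_MONTHS(partition_key, 1)
            ELSE partition_key
       END AS partition_key,
       status,
       payload
FROM   original_table t;

-- Check the result before touching the source.
SELECT (SELECT COUNT(*) FROM original_table)      AS source_rows,
       (SELECT COUNT(*) FROM original_table_copy) AS copied_rows
FROM   dual;

-- Swap the names; drop the old table only once the copy has been verified.
ALTER TABLE original_table      RENAME TO original_table_old;
ALTER TABLE original_table_copy RENAME TO original_table;
-- DROP TABLE original_table_old PURGE;
```

Indexes, grants and triggers are not carried over by CREATE TABLE ... AS SELECT, so they would have to be recreated on the copy.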

You may consider adding a new flag column, 'updated', to your table, defaulting to NULL (or 0; I prefer NULL). Using the date criteria for the rows you need to update, you can then update the data group by group in the same way Kombajn described; once a group has been updated, set its 'updated' flag to 1.

For example, let's split the data into groups, taking the year as the grouping criterion, and process the data year by year.

  1. Create a temporary table for year 1:

     CREATE TABLE updated_rows AS SELECT columns... FROM original_table PARTITION (p2001) WHERE YEAR = 2001 ...;
  2. Drop the original (sub)partition:

     ALTER TABLE original_table DROP PARTITION p2001;
  3. Reinsert the updated rows:

     INSERT /*+ APPEND */ INTO original_table(columns...., updated) SELECT columns..., 1 FROM updated_rows;

Hope this helps you process the data step by step, rather than waiting for the whole table to be updated at once. You may consider a cursor that loops over the years.
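
The year-by-year loop might be sketched in PL/SQL as below. Several things here are assumptions: the table is taken to be interval-partitioned by year with a partition already present for each year in the range, the columns id and payload are hypothetical, and every row not yet flagged is treated as matching the criteria. Because partition clauses cannot be bound, the statements are built with dynamic SQL.

```sql
-- Hypothetical schema: id, partition_key (DATE), payload, updated (flag column).
BEGIN
  FOR y IN 2001 .. 2003 LOOP
    -- Copy the whole year, shifting only rows that have not been processed yet.
    -- The flag keeps rows that moved in from the previous year from shifting twice.
    EXECUTE IMMEDIATE
      'CREATE TABLE updated_rows AS
       SELECT id,
              CASE WHEN updated IS NULL
                   THEN ADD_MONTHS(partition_key, 1)
                   ELSE partition_key
              END AS partition_key,
              payload
       FROM   original_table PARTITION FOR (DATE ''' || TO_CHAR(y) || '-01-01'')';

    -- Drop the partition that holds this year.
    EXECUTE IMMEDIATE
      'ALTER TABLE original_table DROP PARTITION FOR (DATE ''' || TO_CHAR(y) || '-01-01'')';

    -- Reinsert the rows with the flag set.
    EXECUTE IMMEDIATE
      'INSERT /*+ APPEND */ INTO original_table (id, partition_key, payload, updated)
       SELECT id, partition_key, payload, 1 FROM updated_rows';
    COMMIT;

    -- Clear the staging table for the next year.
    EXECUTE IMMEDIATE 'DROP TABLE updated_rows PURGE';
  END LOOP;
END;
/
```

Processing the years in ascending order matters here: a row shifted from December of one year lands in the next year's partition already flagged, so the next iteration leaves its date alone.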


 