
Merging 2 tables that were partitioned together via “UNION ALL”

I have a common pattern in the current database that I would like to rip out. I have 3 objects where a single one will suffice: current_table, history_table, combined_view.

current_table and history_table have exactly the same columns and contain data split on a timestamp: history_table contains data up to, but not including, 2010-01-01, and current_table contains data from 2010-01-01 onward.

The combined view is (poor man's partitioning):

select * from history_table
UNION ALL
select * from current_table

I would like to have a single table with the same name as the view and do away with the history_table and the view. My algorithm is:

  1. Drop constraints on cutoff time.
  2. Move data from history_table into current_table
  3. Rename history_table to history_table_DEPR, rename view to combined_view_DEPR, rename current_table to combined_view
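
For illustration, step 3 could be sketched roughly as below. This assumes SQL Server and the system procedure sp_rename; the column names ts and val and the mock table definitions are hypothetical and exist only so the sketch runs on its own. GO is a batch separator for SSMS/sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical mock objects standing in for the real ones.
CREATE TABLE dbo.history_table (ts DATETIME NOT NULL, val INT NULL);
CREATE TABLE dbo.current_table (ts DATETIME NOT NULL, val INT NULL);
GO
-- CREATE VIEW must be the first statement in its batch.
CREATE VIEW dbo.combined_view AS
  SELECT ts, val FROM dbo.history_table
  UNION ALL
  SELECT ts, val FROM dbo.current_table;
GO
-- Rename inside one transaction so readers never see a half-renamed state.
-- The view has to give up its name before the table can take it.
BEGIN TRANSACTION;
  EXEC sp_rename 'dbo.history_table', 'history_table_DEPR';
  EXEC sp_rename 'dbo.combined_view', 'combined_view_DEPR';
  EXEC sp_rename 'dbo.current_table', 'combined_view';
COMMIT TRANSACTION;
```

Note that sp_rename does not rewrite the view's definition, so combined_view_DEPR would still reference the old table names and stop working once they are renamed. That is probably acceptable for an object that is being deprecated anyway.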

I currently achieve (2) above via the following SQL:

INSERT INTO current_table
SELECT * FROM history_table

I imagine (2) is where the bulk of the time is spent. I am worried that the insert above will attempt to write a log for each row inserted and will be slower than it could be. What is the best way to move the data in this case? I do not care about logging these moves.

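This will move the data in batches:

 -- Prime @@rowcount so the loop body runs at least once.
 select 1 
 while (@@rowcount > 0)
 begin
    -- Copy up to 100,000 rows that are not yet in current_table.
    -- PK stands for the primary-key column of your tables.
    INSERT INTO current_table
    SELECT top (100000) * FROM history_table ht
    where not exists ( select 1 from current_table ctt 
                       where ctt.PK = ht.PK 
                     )
 end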
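
If the tables have no convenient key for the NOT EXISTS check, a different batching technique is DELETE ... OUTPUT INTO, which removes each batch from the source as it is copied. The following is only a minimal sketch under assumptions: the tables dbo.history_demo and dbo.current_demo, their columns, and the tiny batch size are all hypothetical. SQL Server also requires that the target of OUTPUT INTO has no enabled triggers, no foreign keys and no CHECK constraints, which should hold here only if the cutoff constraints were dropped first.

```sql
-- Hypothetical mock tables so the sketch is self-contained.
CREATE TABLE dbo.history_demo (id INT NOT NULL, val VARCHAR(10) NULL);
CREATE TABLE dbo.current_demo (id INT NOT NULL, val VARCHAR(10) NULL);

INSERT INTO dbo.history_demo (id, val)
VALUES (1, 'a'), (2, 'b'), (3, 'c');

DECLARE @batch INT = 2;  -- tiny batch size, for demonstration only
DECLARE @moved INT = 1;  -- non-zero so the loop runs at least once

WHILE (@moved > 0)
BEGIN
    -- Delete one batch from the source and write the deleted rows
    -- into the target within the same statement.
    DELETE TOP (@batch)
    FROM dbo.history_demo
    OUTPUT deleted.id, deleted.val
      INTO dbo.current_demo (id, val);

    -- Capture the row count immediately; the loop ends on an empty batch.
    SET @moved = @@ROWCOUNT;
END;

-- Check where the rows ended up.
SELECT COUNT(*) AS remaining_in_history FROM dbo.history_demo;
SELECT COUNT(*) AS rows_in_current FROM dbo.current_demo;
```

Every row is still logged with this approach. Batching mainly keeps each transaction small, and the source shrinks as the loop progresses, so no key comparison is needed.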

I wouldn't move the data at all, especially if you're going to have to repeat this exercise. Use some partitioning tricks to shuffle metadata around instead.

1) Create an intermediate staging table with two partitions based on your separation date.
2) Create your eventual target table, named after your view, without partitions.
3) Switch the data from the existing tables into the partitioned table.
4) Collapse the two partitions into one partition.
5) Switch the remaining partition into your new target table.
6) Drop all the working objects.
7) Repeat as needed.

-- Step 0.
-- Standard issue pre-cleaning.
IF OBJECT_ID('dbo.OldData','U') IS NOT NULL
  DROP TABLE dbo.OldData;

IF OBJECT_ID('dbo.NewData','U') IS NOT NULL
  DROP TABLE dbo.NewData;

IF OBJECT_ID('dbo.CleanUp','U') IS NOT NULL
  DROP TABLE dbo.CleanUp;

IF OBJECT_ID('dbo.AllData','U') IS NOT NULL
  DROP TABLE dbo.AllData;

IF EXISTS (SELECT * FROM sys.partition_schemes
    WHERE name = 'psCleanUp')
  DROP PARTITION SCHEME psCleanUp;

IF EXISTS (SELECT * FROM sys.partition_functions  
    WHERE name = 'pfCleanUp') 
  DROP PARTITION FUNCTION pfCleanUp;

-- Mock up your existing situation. Two data tables.

CREATE TABLE dbo.OldData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

CREATE TABLE dbo.NewData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

INSERT INTO dbo.OldData
  (
    Dates
   ,OtherStuff
  )
VALUES
  (
    '20090101' -- Dates - date
   ,''        -- OtherStuff - varchar(1)
  );

INSERT INTO dbo.NewData
  (
    Dates
   ,OtherStuff
  )
VALUES
  (
    '20110101' -- Dates - date
   ,''        -- OtherStuff - varchar(1)
  );


-- Step 0.5
-- Here's where the solution starts.
-- Add check constraints to your existing tables.
-- The partition switch requires them to be sure
-- the incoming data fits the partition scheme.

ALTER TABLE dbo.OldData
ADD CONSTRAINT ckOld CHECK (Dates < '2010-01-01');

ALTER TABLE dbo.NewData
ADD CONSTRAINT ckNew CHECK (Dates >= '2010-01-01');

-- Step 1.
-- Create your partitioning artifacts and 
-- intermediate table. 

CREATE PARTITION FUNCTION pfCleanUp (DATE)
AS RANGE RIGHT FOR VALUES ('2010-01-01');

CREATE PARTITION SCHEME psCleanUp
AS PARTITION pfCleanUp
ALL TO ([PRIMARY]);

CREATE TABLE dbo.CleanUp
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
) ON psCleanUp(Dates);

-- Step 2.
-- Create your new target table.

CREATE TABLE dbo.AllData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

-- Step 3.
-- Start flopping metadata around.

ALTER TABLE dbo.OldData 
SWITCH TO dbo.CleanUp PARTITION 1;

ALTER TABLE dbo.NewData
SWITCH TO dbo.CleanUp PARTITION 2;

-- Step 4.
-- Your old tables should be empty now.
-- Put all of the data into one partition.

ALTER PARTITION FUNCTION pfCleanUp()
MERGE RANGE ('2010-01-01');

-- Step 5.
-- Switch that partition out to your
-- spanky new table.

ALTER TABLE dbo.CleanUp 
SWITCH PARTITION 1 TO dbo.AllData;

-- Verify the data's where it belongs.

SELECT *
FROM dbo.AllData;

-- Verify the data's not where it shouldn't be.

SELECT * FROM dbo.OldData;
SELECT * FROM dbo.NewData;
SELECT * FROM dbo.CleanUp ;

-- Step 6.
-- Clean up after yourself.

DROP TABLE dbo.OldData;
DROP TABLE dbo.NewData;
DROP TABLE dbo.CleanUp;
DROP PARTITION SCHEME psCleanUp;
DROP PARTITION FUNCTION pfCleanUp;
-- This one's just here for me.
DROP TABLE dbo.AllData;
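
To sanity-check which partition a given date lands in, and how many rows each partition holds before merging, something along these lines may help. It is a sketch only: pfDemo, psDemo and dbo.DemoCleanUp are hypothetical names chosen so it does not collide with the script above, and the rows column of sys.partitions is documented as approximate.

```sql
-- Hypothetical partitioning artifacts mirroring the ones above.
CREATE PARTITION FUNCTION pfDemo (DATE)
AS RANGE RIGHT FOR VALUES ('2010-01-01');

CREATE PARTITION SCHEME psDemo
AS PARTITION pfDemo
ALL TO ([PRIMARY]);

CREATE TABLE dbo.DemoCleanUp
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
) ON psDemo(Dates);

INSERT INTO dbo.DemoCleanUp (Dates, OtherStuff)
VALUES ('20090101', ''), ('20110101', '');

-- One row per partition of the heap (index_id 0) or clustered
-- index (index_id 1), with its row count.
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.DemoCleanUp')
  AND p.index_id IN (0, 1);

-- $PARTITION maps a value to its partition number.
-- With RANGE RIGHT the boundary value belongs to the right-hand
-- partition, so the first column is 1 and the second is 2.
SELECT $PARTITION.pfDemo('2009-12-31') AS before_boundary,
       $PARTITION.pfDemo('2010-01-01') AS on_boundary;

-- Clean up the demo objects.
DROP TABLE dbo.DemoCleanUp;
DROP PARTITION SCHEME psDemo;
DROP PARTITION FUNCTION pfDemo;
```

This is also why the check constraints in Step 0.5 use < for the old table and >= for the new one: they have to line up exactly with the RANGE RIGHT boundary, or the SWITCH will be rejected.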
