
Merging 2 tables that were partitioned together via "UNION ALL"

I have a common pattern in the current database that I would like to rip out. I have 3 objects where a single one will suffice: current_table, history_table, combined_view.

current_table and history_table have exactly the same columns and contain data split on a timestamp; that is, history_table contains data up to 2010-01-01 and current_table contains data from 2010-01-01 onward (inclusive).

The combined view is (poor man's partitioning):

select * from history_table
UNION ALL
select * from current_table

I would like to have a single table with the same name as the view and do away with history_table and the view. My algorithm is:

  1. Drop constraints on cutoff time.
  2. Move data from history_table into current_table.
  3. Rename history_table to history_table_DEPR, rename the view to combined_view_DEPR, rename current_table to combined_view.
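
Step 3 could be sketched roughly as follows. This is only an illustration: it assumes SQL Server and that all three objects live in the dbo schema, neither of which is stated above. The view is renamed first so that the name combined_view is free when current_table takes it over.

```sql
-- Sketch only: assumes SQL Server and the dbo schema (both assumptions).

-- Free up the name combined_view by renaming the view first.
EXEC sp_rename 'dbo.combined_view', 'combined_view_DEPR';

-- Retire the (by now empty) history table.
EXEC sp_rename 'dbo.history_table', 'history_table_DEPR';

-- The merged table takes over the name the view used to have.
EXEC sp_rename 'dbo.current_table', 'combined_view';
```

Note that sp_rename on a view does not update the stored view definition text, so if the deprecated view is not needed as a fallback, dropping it may be the cleaner choice.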

I currently achieve (2) above via the following SQL:

INSERT INTO current_table
SELECT * FROM history_table

I imagine (2) is where the bulk of the time is spent. I am worried that the insert above will attempt to write a log record for each row inserted and will be slower than it could be. What is the best way to move the data in this case? I do not care about logging these moves.

This will do the insert in batches:

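 -- Seed @@rowcount so the loop body runs at least once.
 select 1 
 while (@@rowcount > 0)
 begin
    -- Copy the next batch of rows not yet present in current_table.
    -- PK stands for the primary key column(s) of your tables.
    -- (INSERT INTO does not accept a table alias, so none is used here.)
    INSERT INTO current_table
    SELECT top (100000) * FROM history_table ht
    where not exists ( select 1 from current_table ctt 
                       where ctt.PK = ht.PK 
                     )
 end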
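
A possible variation on the loop above, offered only as a sketch: instead of re-checking current_table with NOT EXISTS on every pass, each batch can be deleted from history_table and routed into current_table through the OUTPUT clause. This swaps in a different technique from the one posted. It assumes SQL Server 2008 or later, identical column order in both tables, and that current_table has no enabled triggers, foreign keys, or CHECK constraints (OUTPUT ... INTO does not allow these on its target). The variable names are hypothetical.

```sql
-- Sketch only: moves rows in batches by deleting them from the source
-- and capturing the deleted rows into the target in the same statement.
DECLARE @BatchSize INT = 100000;  -- hypothetical batch size
DECLARE @Moved INT = 1;           -- rows moved by the last batch

WHILE @Moved > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM history_table
    OUTPUT deleted.* INTO current_table;

    -- @@ROWCOUNT here is the number of rows the DELETE removed.
    SET @Moved = @@ROWCOUNT;
END;
```

Because each batch shrinks history_table, the loop ends when the table is empty, and no primary key lookup against current_table is needed.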

I wouldn't move the data at all, especially if you're going to have to repeat this exercise. Use some partitioning tricks to shuffle metadata around instead.

1) Create an intermediate staging table with two partitions based on your separation date.
2) Create your eventual target table, named after your view, without partitions.
3) Switch the data from the existing tables into the partitioned table.
4) Collapse the two partitions into one partition.
5) Switch the remaining partition into your new target table.
6) Drop all the working objects.
7) Repeat as needed.

-- Step 0.
-- Standard issue pre-cleaning.
IF OBJECT_ID('dbo.OldData','U') IS NOT NULL
  DROP TABLE dbo.OldData;

IF OBJECT_ID('dbo.NewData','U') IS NOT NULL
  DROP TABLE dbo.NewData;

IF OBJECT_ID('dbo.CleanUp','U') IS NOT NULL
  DROP TABLE dbo.CleanUp;

IF OBJECT_ID('dbo.AllData','U') IS NOT NULL
  DROP TABLE dbo.AllData;

IF EXISTS (SELECT * FROM sys.partition_schemes
    WHERE name = 'psCleanUp')
  DROP PARTITION SCHEME psCleanUp;

IF EXISTS (SELECT * FROM sys.partition_functions  
    WHERE name = 'pfCleanUp') 
  DROP PARTITION FUNCTION pfCleanUp;

-- Mock up your existing situation. Two data tables.

CREATE TABLE dbo.OldData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

CREATE TABLE dbo.NewData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

INSERT INTO dbo.OldData
  (
    Dates
   ,OtherStuff
  )
VALUES
  (
    '20090101' -- Dates - date
   ,''        -- OtherStuff - varchar(1)
  );

INSERT INTO dbo.NewData
  (
    Dates
   ,OtherStuff
  )
VALUES
  (
    '20110101' -- Dates - date
   ,''        -- OtherStuff - varchar(1)
  );


-- Step .5
-- Here's where the solution starts.
-- Add check constraints to your existing tables.
-- The partition switch will require this to be sure
-- the incoming data works with the partition scheme.

ALTER TABLE dbo.OldData
ADD CONSTRAINT ckOld CHECK (Dates < '2010-01-01');

ALTER TABLE dbo.NewData
ADD CONSTRAINT ckNew CHECK (Dates >= '2010-01-01');

-- Step 1.
-- Create your partitioning artifacts and 
-- intermediate table. 

CREATE PARTITION FUNCTION pfCleanUp (DATE)
AS RANGE RIGHT FOR VALUES ('2010-01-01');

CREATE PARTITION SCHEME psCleanUp
AS PARTITION pfCleanUp
ALL TO ([PRIMARY]);

CREATE TABLE dbo.CleanUp
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
) ON psCleanUp(Dates);

-- Step 2.
-- Create your new target table.

CREATE TABLE dbo.AllData
(
  [Dates] DATE NOT NULL
  ,[OtherStuff] VARCHAR(1) NULL
);

-- Step 3.
-- Start flopping metadata around.

ALTER TABLE dbo.OldData 
SWITCH TO dbo.CleanUp PARTITION 1;

ALTER TABLE dbo.NewData
SWITCH TO dbo.CleanUp PARTITION 2;

-- Step 4.
-- Your old tables should be empty now.
-- Put all of the data into one partition.

ALTER PARTITION FUNCTION pfCleanUp()
MERGE RANGE ('2010-01-01');

-- Step 5.
-- Switch that partition out to your
-- spanky new table.

ALTER TABLE dbo.CleanUp 
SWITCH PARTITION 1 TO dbo.AllData;

-- Verify the data's where it belongs.

SELECT *
FROM dbo.AllData;

-- Verify the data's not where it shouldn't be.

SELECT * FROM dbo.OldData;
SELECT * FROM dbo.NewData;
SELECT * FROM dbo.CleanUp ;

-- Step 6.
-- Clean up after yourself.

DROP TABLE dbo.OldData;
DROP TABLE dbo.NewData;
DROP TABLE dbo.CleanUp;
DROP PARTITION SCHEME psCleanUp;
DROP PARTITION FUNCTION pfCleanUp;
-- This one's just here for me.
DROP TABLE dbo.AllData;
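
As an optional sanity check between Step 3 and Step 4 (before the MERGE and before anything is dropped), the row count per partition of the staging table can be read from the catalog. This is a minimal sketch that reuses the dbo.CleanUp name from the script above; the counts reported by sys.partitions are documented as approximate.

```sql
-- Sketch only: run after the two SWITCH statements in Step 3.
-- Expect one row per partition of dbo.CleanUp.
SELECT p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.CleanUp')
  AND p.index_id IN (0, 1)  -- heap (0) or clustered index (1) only
ORDER BY p.partition_number;
```

With the mock data above, partition 1 should hold the row switched in from dbo.OldData and partition 2 the row from dbo.NewData.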
