
SQL 2014 - SSIS or stored procedures to copy data from SQL Server to SQL Server with the same table structure

We have an SSIS project that loads data from CSV files into a staging area on SQL Server (DB_Stage).

The main purpose of staging is to prepare the data for the move to the production database (DB_Prod) and, in the process, to flag any errors in the data or files.

DB_Stage was created from DB_Prod's create table scripts, so the table structure is the same in both databases. Once the load to DB_Stage succeeds, the data needs to be moved to DB_Prod.

I'm thinking of creating a stored procedure for each table in DB_Stage to push data to DB_Prod, since no transformation is required and I assume SQL-to-SQL is faster. However, I have read some articles saying that SSIS is capable of parallel processing and that the load will be much faster. I didn't fully understand them.

I can create another set of SSIS packages to move data from Stage to Prod using Biml in no time. But I need some advice on which approach is best in my scenario: stored procedures or SSIS packages.

One advantage of using SSIS packages is that I can configure the destination database, so Stage data can be loaded to any server/database (this is a requirement for us).

If I use stored procedures, I can't find a way to parameterize the target database. It seems I must hard-code it this way:

insert into DB_Prod.dbo.Table1 (col list) select (col list) from DB_Stage.dbo.table1

Any help would be greatly appreciated.

As you mentioned, no transformation is needed between the staging and prod environments, so I would recommend SSIS over stored procedures. SSIS will treat this as a synchronous task and start transferring records as soon as it can. SSIS can also take advantage of its buffer pipeline, which you can tune to achieve parallelism.

I would recommend a few settings when using SSIS in this case:

  • Avoid table locks on the destination.
  • Adjust the maximum number of rows and the maximum row commit size.
  • If you plan to transfer data for multiple tables at once, set the maximum number of threads and the buffers accordingly.

I am pretty sure you will see a performance gain by using SSIS instead of T-SQL here.

As you mentioned, there are two ways to copy the data from one server to another. Let's go through them one by one.

Stored Procedure: You will first have to create a linked server connection between the Prod server and the Staging server using sp_addlinkedserver. This lets you use four-part naming from the Staging server to reference tables, like [ProdServer].[ProdDB].[dbo].[Table1]. Here you can make use of what is called a 'dynamic query'. In this kind of query, we can supply certain parts of a SQL query as varchar variables and then execute the query.
What you will essentially be writing is a query as follows:

DECLARE @sql nvarchar(max) =
    N'INSERT INTO ' + QUOTENAME(@ProdServer) + N'.' + QUOTENAME(@ProdDB) + N'.[dbo].[Table1] (col list)
SELECT (col list) FROM [DB_Stage].[dbo].[table1]';
EXEC sys.sp_executesql @sql;

Keep in mind that the query string is limited to 4,000 characters if it is held in an nvarchar(n) variable, or 8,000 characters in a varchar(n) variable. Declaring the variable as nvarchar(max) lifts that limit.
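A fuller sketch of the stored procedure approach might look like the following. It is only an illustration under stated assumptions: the server name PRODSRV01, the procedure name dbo.usp_Push_Table1, and the columns Col1 and Col2 are hypothetical placeholders that do not come from the question, and the linked server is assumed to use the default login mapping. It would need adapting to the real column lists and security setup.

```sql
-- One-time setup on the staging server. PRODSRV01 is a hypothetical server name.
-- With @srvproduct = N'SQL Server', @server must be the network name of the instance.
EXEC sys.sp_addlinkedserver
     @server     = N'PRODSRV01',
     @srvproduct = N'SQL Server';
GO

-- Hypothetical procedure: copies dbo.Table1 from DB_Stage to a target chosen at run time.
CREATE PROCEDURE dbo.usp_Push_Table1
    @ProdDB     sysname,
    @ProdServer sysname = NULL   -- NULL means the target database is on the local instance
AS
BEGIN
    SET NOCOUNT ON;

    -- QUOTENAME wraps each identifier in brackets and escapes any ] inside it,
    -- so the parameter values cannot break out of the statement.
    DECLARE @target nvarchar(600) =
          ISNULL(QUOTENAME(@ProdServer) + N'.', N'')
        + QUOTENAME(@ProdDB) + N'.[dbo].[Table1]';

    -- nvarchar(max) avoids the 4,000-character limit of nvarchar(n).
    -- Col1, Col2 stand in for the real column list.
    DECLARE @sql nvarchar(max) =
          N'INSERT INTO ' + @target + N' (Col1, Col2) '
        + N'SELECT Col1, Col2 FROM [DB_Stage].[dbo].[Table1];';

    EXEC sys.sp_executesql @sql;
END;
GO

-- Usage: push to DB_Prod on the linked server, or to a database on the local instance.
EXEC dbo.usp_Push_Table1 @ProdDB = N'DB_Prod', @ProdServer = N'PRODSRV01';
EXEC dbo.usp_Push_Table1 @ProdDB = N'DB_Prod';
```

Only the server and database names are passed as parameters here. The table name and column list stay fixed in each procedure, which matches the one-procedure-per-table idea in the question.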

SSIS:
As you mentioned, SSIS allows you to parallelize the data flow from your staging server to the Prod server. The methods are quite straightforward, as explained here. However, if the table is very large, I would suggest using the Balanced Data Distributor, which is an optimization for parallel data flow.
