
Loading data from Oracle to SQL Server into multiple tables simultaneously using SSIS

Original post: 2018-08-01 03:29:18. Tags: sql-server, ssis, sql-data-warehouse


I am creating a clinical data warehouse, so I am testing different scenarios. I am loading the tables below from an Oracle database (Attunity connector) into a SQL Server database (OLE DB):

Table1: 1.2 GB (3 million rows)
Table2: 20 GB (200 million rows)
Table3: 100 GB (250 million rows)
Table4: 25 GB (60 million rows)

For my initial load I am planning to use SSIS and simply run select * from TABLE1/TABLE2/TABLE3/TABLE4.

Questions:

Is it OK to have multiple data flow tasks in one package, one per table, so that they all run together? I just wanted to improve the speed that way. But somehow it is slower than when I run each load individually.

Also, for loading complete tables, is "select * from table" a good approach? It seems pretty slow!

2 answers

You can run as many data flow tasks in parallel as the number of processor cores you have, minus one. That is, if you are using an octa-core processor, the ideal number of parallel tasks is 7 (8 - 1). Just put them in different sequence containers (not compulsory, but it helps readability) and execute.
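The "cores minus one" rule of thumb above can be sketched as a small calculation. This is only an illustration of the arithmetic; the function name `max_parallel_tasks` is hypothetical, and the result would still have to be applied by hand in the package (for example through the package-level MaxConcurrentExecutables property, which caps how many tasks run at once).

```python
import os


def max_parallel_tasks(cores=None):
    """Rule of thumb from the answer: logical cores minus one, never below 1."""
    if cores is None:
        # os.cpu_count() can return None when the count is unknown.
        cores = os.cpu_count() or 1
    return max(1, cores - 1)


# Octa-core example from the answer: 8 - 1 = 7 parallel data flow tasks.
print(max_parallel_tasks(8))
```

With only four tables to load, the table count is the practical limit here, so the rule mostly matters on machines with fewer than five cores.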

You can speed up the data load by adjusting several things: set DelayValidation=true, use OPTION (FAST 10000) (a SQL Server query hint; try other values as well), and play around with DefaultBufferSize and DefaultBufferMaxRows until you find values that work. Also, if you intend to run parallel DFTs, check that MAXDOP is not set to 1 in the settings.
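As a starting point for the buffer trials, one can estimate how many rows fit in a single buffer. The sketch below assumes the average row width can be approximated as table size divided by row count, using the figures from the question. That is a rough assumption: SSIS sizes buffers from the column metadata in the pipeline, not from on-disk size. The function name `rows_per_buffer` is hypothetical; 10 MB is the default DefaultBufferSize.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # SSIS default DefaultBufferSize (10 MB)


def rows_per_buffer(table_bytes, row_count, buffer_bytes=DEFAULT_BUFFER_SIZE):
    """Estimate how many average-width rows fit in one data flow buffer."""
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    avg_row_bytes = table_bytes / row_count
    return int(buffer_bytes // avg_row_bytes)


# Table3 from the question: 100 GB, 250 million rows.
table3 = rows_per_buffer(100 * 1024**3, 250_000_000)
print(table3)
```

If the estimate comes out well above the default DefaultBufferMaxRows of 10,000, raising DefaultBufferMaxRows is one candidate for the trials; if it comes out below, raising DefaultBufferSize is the other.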

And NEVER use SELECT * from table_name. List the column names explicitly; * adds overhead and can slow down your query considerably.
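A minimal sketch of building the source query with an explicit column list instead of `*`. The helper `build_select` and the column names are hypothetical, since the question does not list the real columns of TABLE1; the identifiers are assumed to come from trusted metadata, as no quoting or escaping is done.

```python
def build_select(table, columns):
    """Build a SELECT statement that names every column explicitly."""
    if not columns:
        raise ValueError("at least one column is required")
    return "SELECT " + ", ".join(columns) + " FROM " + table


# Hypothetical column names, for illustration only.
print(build_select("TABLE1", ["PATIENT_ID", "VISIT_DATE"]))
```

The resulting string would be pasted into the source component's SQL command in place of `select * from TABLE1`.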

Process 1: Using SSMA

You can use SQL Server Migration Assistant (SSMA) to migrate data from Oracle to SQL Server databases, schemas, and tables.

This is a free tool from Microsoft for database migration.

Microsoft SQL Server Migration Assistant (SSMA) is a tool designed to automate database migration to SQL Server from Microsoft Access, DB2, MySQL, Oracle, and SAP ASE.

Process 2: Using SSIS

You can also use a SQL Server Integration Services (SSIS) package for the migration.

Create the SSIS package with the Import/Export Wizard and run the package from the command line.
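Running a saved package from the command line is typically done with the dtexec utility and its /F (file) option. The sketch below only builds the argument list; the helper name `dtexec_command` and the package file name `LoadTables.dtsx` are hypothetical, and it assumes dtexec is on the PATH of the machine where the package runs.

```python
def dtexec_command(package_path):
    """Return the argument list that runs a .dtsx package file with dtexec."""
    if not package_path.lower().endswith(".dtsx"):
        raise ValueError("expected a .dtsx package file")
    return ["dtexec", "/F", package_path]


# Hypothetical package name, for illustration only.
cmd = dtexec_command("LoadTables.dtsx")
print(" ".join(cmd))

# To actually run it on a machine with SSIS installed, one could pass the
# list to subprocess.run(cmd, check=True).
```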