How to increase the performance of inserting data into a table in SSIS
I have a text file containing 3 lakh (300,000) records with 7 columns. I insert the data into a staging table, then apply business logic to load it from there into multiple tables.

The file looks like this:
01|000001111|27/04/2011|12/01/2012|ISDF|AB|1
02|000002222|09/01/2010|29/01/2010|CfGH|CV|1
03|000003333|19/07/2005|09/07/2007|TBRF|CC|1
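For reference, here is a minimal Python sketch that parses one row of this pipe-delimited layout; the column names are assumptions, since the file carries no header, and the dates are read as dd/mm/yyyy based on the sample values:

```python
from datetime import datetime

# Hypothetical column names -- the file itself has no header row.
COLUMNS = ["record_id", "account_no", "start_date", "end_date", "code", "type", "flag"]

def parse_row(line: str) -> dict:
    """Split one pipe-delimited row and convert the dd/mm/yyyy date columns."""
    values = line.strip().split("|")
    row = dict(zip(COLUMNS, values))
    for col in ("start_date", "end_date"):
        row[col] = datetime.strptime(row[col], "%d/%m/%Y").date()
    return row

sample = "01|000001111|27/04/2011|12/01/2012|ISDF|AB|1"
print(parse_row(sample))
```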
The staging table is called Stagetable.
I am using a Flat File source and an OLE DB destination, and it takes more than 7 hours just to load the data into the staging table. I need to improve the performance.
This is my first time using an SSIS package.
Any suggestions on how I could improve the performance would be great.
Thanks, Prince
The following should help:
- Move your file close to your target (i.e. onto the same box) before processing, to avoid network latency
- Deploy and run your package from within SQL Server, not Visual Studio
- Use the SQL Server destination instead of the OLE DB destination for quicker loading
- Switch off constraint checking on the destination component
- Check that the target table does not have expensive triggers that fire on load
- Consider using the Bulk Insert task if you do not need to perform any transformations
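When no transformation is needed, the Bulk Insert task is essentially a `BULK INSERT` statement. A hedged sketch that composes the equivalent T-SQL, with the file path as a placeholder and `TABLOCK`/`BATCHSIZE` chosen to encourage minimally logged, chunked loading:

```python
def build_bulk_insert(table: str, file_path: str, batch_size: int = 50000) -> str:
    """Compose a T-SQL BULK INSERT statement for a pipe-delimited file.

    The file path passed by the caller is a placeholder, not a value
    from the question; adjust ROWTERMINATOR to match the actual file.
    """
    return (
        f"BULK INSERT {table}\n"
        f"FROM '{file_path}'\n"
        "WITH (\n"
        "    FIELDTERMINATOR = '|',\n"
        "    ROWTERMINATOR = '\\n',\n"
        f"    BATCHSIZE = {batch_size},\n"
        "    TABLOCK\n"
        ");"
    )

print(build_bulk_insert("Stagetable", r"D:\data\input.txt"))
```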
3 lakh is only 300,000, and the rows do not look particularly wide either. That is a pretty small number to be causing any problem.
Try to find out where the bottleneck is: the source or the destination? If the source is the bottleneck, you can split the file, stage the pieces into separate but identical staging tables such as stg_1, stg_2, etc., and then load them in parallel.
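Splitting the flat file for parallel staging takes only a few lines; a sketch, where the chunk count and the stg_N output naming are assumptions:

```python
import os

def split_file(path: str, out_dir: str, parts: int = 4) -> list:
    """Split a flat file into `parts` roughly equal chunks, one per
    staging table (stg_1.txt, stg_2.txt, ...), keeping lines whole."""
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    chunk = -(-len(lines) // parts)  # ceiling division
    out_paths = []
    for i in range(parts):
        piece = lines[i * chunk:(i + 1) * chunk]
        if not piece:
            break
        out_path = os.path.join(out_dir, "stg_%d.txt" % (i + 1))
        with open(out_path, "w", encoding="utf-8") as out:
            out.writelines(piece)
        out_paths.append(out_path)
    return out_paths
```

Each output file then feeds its own Flat File source in a separate data-flow path.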
If the destination is the bottleneck, you can use the Balanced Data Distributor to spread the rows across different staging tables.
Again, all of this seems like overkill for just 300K rows.
Make sure you do not have any indexes on the staging table(s). We would love to hear what you tried, what worked, and how you resolved it.