ETL as a transaction
For all the ETLs I have written so far, I have never made them transactional, i.e. if table 4 fails, roll everything back.
What is the best practice in this regard?
To "BeginTran + Commit" or not to "BeginTran + Commit"?
EDIT: I have one master package calling 4 other packages. Is it possible to roll them all up into one transaction?
In SSIS, I always Begin Trans + Commit. I want to make sure that I can re-run the package without issue (or having to find what rows actually got inserted) if it fails.
It just makes recovery and cleanup so much easier.
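To make the all-or-nothing idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 rather than SSIS, so it only illustrates the pattern; the table names and the `run_master` function are hypothetical stand-ins for a master package and its four child loads.

```python
import sqlite3

# Hypothetical target tables, one per child "package".
TABLES = ["table1", "table2", "table3", "table4"]

def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # BEGIN / COMMIT / ROLLBACK statements below are fully explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    for t in TABLES:
        conn.execute(f"CREATE TABLE {t} (id INTEGER PRIMARY KEY, val TEXT NOT NULL)")
    return conn

def run_master(conn, batches):
    """Load every table inside one transaction; undo all of them on any failure."""
    conn.execute("BEGIN")
    try:
        for table in TABLES:
            conn.executemany(
                f"INSERT INTO {table} (id, val) VALUES (?, ?)", batches[table]
            )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # A failure in table4 also discards what tables 1-3 already inserted.
        conn.execute("ROLLBACK")
        return False

def count_rows(conn):
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in TABLES}
```

If the load for `table4` violates the `NOT NULL` constraint, `run_master` returns `False` and all four tables stay empty, so the run can simply be repeated once the data is fixed.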
Begin + commit in manageable batch sizes. You don't want to wrap a 6-hour import into a single transaction every night. Keep your batches at a size that can finish in 2-3 minutes at most. It is a given that you will hit data purity issues that fail an ETL, so at least reduce the impact to something manageable (i.e. don't trigger a rollback that will take another 6 hours to complete).
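A rough sketch of per-batch commits, again using Python's sqlite3 purely as an illustration. The `target` table, the helper names, and sizing batches by row count (rather than by elapsed time) are all assumptions made for the example.

```python
import sqlite3

def open_target():
    # Autocommit mode: transactions are controlled by explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT NOT NULL)")
    return conn

def load_in_batches(conn, rows, batch_size):
    """Insert rows batch by batch, one transaction per batch.

    Returns the number of rows committed before the first failing batch.
    """
    committed = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.execute("BEGIN")
        try:
            conn.executemany("INSERT INTO target (id, val) VALUES (?, ?)", batch)
            conn.execute("COMMIT")
        except sqlite3.Error:
            # Only the current batch is rolled back; earlier batches stay committed.
            conn.execute("ROLLBACK")
            break
        committed += len(batch)
    return committed
```

With this shape, a bad row costs a rollback of one small batch instead of the whole import. The trade-off is that the target is left partially loaded, so the job also needs a way to resume.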
You are often moving too much data in ETL to use a SQL transaction (remember, the log has to store ALL the data needed to roll back). I prefer to design packages such that they can be re-run nondestructively. Ideally they should be set up so that if they die mid-stream, you can just start them again and they'll continue from approximately where they left off. Sometimes there's a performance penalty for that, but I think it's worth it.
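One way to get that restartable behaviour is a high-water mark: before loading, check what already landed in the target and skip it. The sketch below is only one possible approach, not necessarily what this answer's author does. It assumes source rows carry a positive, ever-increasing `id`; the `target` table, `resume_load`, and the `fail_on_batch` switch (used only to simulate a crash) are hypothetical.

```python
import sqlite3

def open_target():
    # Autocommit mode: transactions are controlled by explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT NOT NULL)")
    return conn

def resume_load(conn, source_rows, batch_size, fail_on_batch=None):
    """Load rows whose id is above the target's high-water mark, batch by batch.

    Returns the total number of rows in the target after the run.
    """
    # Assumption: ids are positive and increasing, so MAX(id) marks progress.
    high_water = conn.execute("SELECT COALESCE(MAX(id), 0) FROM target").fetchone()[0]
    pending = [r for r in sorted(source_rows) if r[0] > high_water]
    for n, start in enumerate(range(0, len(pending), batch_size)):
        if n == fail_on_batch:
            raise RuntimeError("simulated mid-stream crash")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO target (id, val) VALUES (?, ?)",
            pending[start:start + batch_size],
        )
        conn.execute("COMMIT")
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Re-running `resume_load` after a crash picks up after the last committed batch, and running it again on a fully loaded target inserts nothing. The extra `MAX(id)` lookup and filtering is the kind of performance penalty mentioned above.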
Technically you can roll packages up into a single transaction; practically, maybe not.