SQL recursively copy rows from multiple tables following PK FK relationships

Original post: 2012-03-09 21:59:50. Tags: c#, sql, database, recursion

I was given the task of creating a stored procedure to copy every piece of data associated with a given ID in our database. This data spans dozens of tables, and each table may have dozens of matching rows.

Example:

Table Account
PK = AccountID

Table AccountSettings
FK = AccountID

Table Users
PK = UserID
FK = AccountID

Table UserContent
PK = UserContentID
FK = UserID

I want to create a copy of everything that is associated with an AccountID (which will traverse nearly every table). The copy will have a new AccountID and UserContentID but will have the same UserID. The new data needs to be in its respective table. :) Fun, right?

The above is just a sample, but I will be doing this for something like 50 or 60 tables. I have researched using CTEs but am still a bit foggy on them; that may prove to be the best method. My SQL skills are... well, I have worked with it for about 40 logged hours so far :)

Any advice or direction on where to look would be greatly appreciated. In addition, I am not opposed to doing this via C# if that would be possible or better.

Thanks in advance for any help or info.

1 Answer

The simplest way to solve this is brute force: write a very long proc that processes each table individually. This will be error-prone and very hard to maintain, but it has the advantage of not relying on the database or its metadata being in any particularly consistent state.
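To make the brute-force shape concrete, here is a minimal sketch. It uses Python with an in-memory SQLite database purely so it can be run as-is; a real version would be a T-SQL stored procedure. The `Name` and `Setting` columns, the sample data, and the `copy_account` helper are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE AccountSettings (AccountID INTEGER, Setting TEXT);
INSERT INTO Account VALUES (1, 'acme');
INSERT INTO AccountSettings VALUES (1, 'dark-mode'), (1, 'beta');
""")

def copy_account(conn, old_id):
    """Brute force: one hand-written block of statements per table."""
    # Table 1: copy the Account row and capture its generated key.
    name = conn.execute(
        "SELECT Name FROM Account WHERE AccountID = ?", (old_id,)).fetchone()[0]
    new_id = conn.execute(
        "INSERT INTO Account (Name) VALUES (?)", (name,)).lastrowid

    # Table 2: copy the settings, pointing them at the new account.
    conn.execute(
        "INSERT INTO AccountSettings (AccountID, Setting) "
        "SELECT ?, Setting FROM AccountSettings WHERE AccountID = ?",
        (new_id, old_id))

    # ... and so on, one more hand-written block for every remaining table ...
    return new_id

new_id = copy_account(conn, 1)
copied_settings = sorted(
    r[0] for r in conn.execute(
        "SELECT Setting FROM AccountSettings WHERE AccountID = ?", (new_id,)))
```

Nothing here consults the schema, which is why it keeps working when foreign keys are missing, and also why every schema change means editing the procedure by hand.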

If you want something that works based on metadata, things are more interesting. You have three challenges there:

1. You need to programmatically identify all the related tables.
2. You need to generate insert statements for all 50 or 60 tables.
3. You need to capture generated ids for those tables that are more than one or two steps away from the Account table, so that they can in turn be used as foreign keys in yet more copied records.
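The first challenge can be sketched as a walk over foreign-key metadata. On SQL Server that metadata lives in catalog views such as sys.foreign_keys and sys.foreign_key_columns; the sketch below uses SQLite's `PRAGMA foreign_key_list` from Python instead, only so that it runs without a server. The schema mirrors the sample tables in the question; the `Name`, `Setting` and `Body` columns, the `Unrelated` table, and the helper names are assumptions for illustration.

```python
import sqlite3
from collections import deque

SCHEMA = """
CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE AccountSettings (
    AccountID INTEGER REFERENCES Account(AccountID), Setting TEXT);
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY,
    AccountID INTEGER REFERENCES Account(AccountID));
CREATE TABLE UserContent (
    UserContentID INTEGER PRIMARY KEY,
    UserID INTEGER REFERENCES Users(UserID), Body TEXT);
CREATE TABLE Unrelated (ID INTEGER PRIMARY KEY);
"""

def foreign_keys(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    edges = []
    for table in tables:
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, ...
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.append((table, fk[3], fk[2], fk[4]))
    return edges

def related_tables(conn, root):
    """Breadth-first walk from root to every table that references it,
    directly or indirectly. Returns {table: number of FK steps from root}."""
    edges = foreign_keys(conn)
    depth = {root: 0}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child, _, ref_parent, _ in edges:
            if ref_parent == parent and child not in depth:
                depth[child] = depth[parent] + 1
                queue.append(child)
    return depth

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
result = related_tables(conn, "Account")
```

The step count matters for the third challenge: any table at depth 2 or more needs the generated keys of its parent before it can be copied. A relationship that exists only in application code, with no declared foreign key, is invisible to this walk.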

I've looked at this problem in the past, and while I can't offer you a watertight algorithm, I can give you a general heuristic. In other words: this is how I'd approach it.

1. Using a later version of MS Entity Framework (you said you'd be open to using C#), build a model of the Account table and all the related tables.
2. Review the heck out of it. If your database is like many, some of the relationships your application(s) assume will, for whatever reason, not have an actual foreign key relationship set up in the database. Create them in your model anyway.
3. Write a little recursive routine in C# that can take an Account object and traverse all the related tables. Pick a couple of Account instances and have it dump table name and key information to a file. Review that for completeness and plausibility.
4. Once you are satisfied you have a good model and a good algorithm that picks up everything, it's time to get cracking on the code. You need to write a more complicated algorithm that can read an Account and recursively clone all the records that reference it. You will probably need reflection in order to do this, but it's not that hard: all the metadata that you need will be in there, somewhere.
5. Test your code. Allow plenty of time for debugging.
6. Use your first algorithm, from step 3, to compare results for completeness and accuracy.
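The recursive clone in step 4 could look roughly like the sketch below. It is not the Entity Framework approach described above: it swaps in Python and SQLite's PRAGMA metadata so it can run on its own, and it rests on loud assumptions. Every table has at most one integer auto-generated primary key, the foreign-key graph is a tree (no cycles, no table reachable by two paths), and every copied row gets a new key, which is simpler than the question's requirement of keeping the same UserID. The `Name`, `Setting`, `Login` and `Body` columns, the sample data, and the helper names are illustrative.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE AccountSettings (
    AccountID INTEGER REFERENCES Account(AccountID), Setting TEXT);
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY,
    AccountID INTEGER REFERENCES Account(AccountID), Login TEXT);
CREATE TABLE UserContent (
    UserContentID INTEGER PRIMARY KEY,
    UserID INTEGER REFERENCES Users(UserID), Body TEXT);
INSERT INTO Account VALUES (1, 'acme');
INSERT INTO AccountSettings VALUES (1, 'dark-mode');
INSERT INTO Users VALUES (10, 1, 'ann'), (11, 1, 'bob');
INSERT INTO UserContent VALUES (100, 10, 'hello'), (101, 11, 'world');
"""

def describe(conn, table):
    """Return (column names, primary-key column or None, {fk_column: parent_table})."""
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    cols = [row[1] for row in info]
    pk = next((row[1] for row in info if row[5]), None)
    fks = {fk[3]: fk[2]
           for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
    return cols, pk, fks

def clone(conn, table, rows, id_map, tables):
    """Copy rows of table, remapping FKs through id_map, then recurse into children."""
    cols, pk, fks = describe(conn, table)
    insert_cols = [c for c in cols if c != pk]  # let the database generate the PK
    marks = ", ".join("?" for _ in insert_cols)
    sql = f'INSERT INTO "{table}" ({", ".join(insert_cols)}) VALUES ({marks})'
    for row in rows:
        old = dict(zip(cols, row))
        values = []
        for c in insert_cols:
            # If c is an FK to an already-cloned parent, substitute the new key.
            parent_map = id_map.get(fks.get(c), {})
            values.append(parent_map.get(old[c], old[c]))
        new_id = conn.execute(sql, values).lastrowid
        if pk:
            id_map.setdefault(table, {})[old[pk]] = new_id  # capture generated id
    old_keys = list(id_map.get(table, {}))
    if not pk or not old_keys:
        return
    for child in tables:
        if child == table:
            continue
        for fk_col, parent in describe(conn, child)[2].items():
            if parent == table:
                in_marks = ", ".join("?" for _ in old_keys)
                child_rows = conn.execute(
                    f'SELECT * FROM "{child}" WHERE "{fk_col}" IN ({in_marks})',
                    old_keys).fetchall()
                clone(conn, child, child_rows, id_map, tables)

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
id_map = {}
root = conn.execute("SELECT * FROM Account WHERE AccountID = ?", (1,)).fetchall()
clone(conn, "Account", root, id_map, tables)
new_account_id = id_map["Account"][1]
```

The `id_map` dictionary is what addresses the third challenge: each old primary key is recorded next to the key generated for its copy, so tables two or more steps from Account can be rewired to the copied parents. Comparing its contents against a read-only dump like the one in step 3 is one way to check that nothing was missed.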

The advantage of the EF approach: as the database changes, so can your model, and if your code is metadata-based, it ought to be able to adapt.

The disadvantage: if you have such phenomena as fields that are "really" the same but are different types, or complex three-way relationships that aren't modeled properly, or embedded CSV lists that you'd need to parse out, this won't work. It only works if your database is in good shape and is well-modeled. Otherwise you'll need to resort to brute force.