MySQL: Replace all instances of specific foreign key with new value
I have a MySQL database with thousands of personnel records, often with duplicates.
For each record that has at least one duplicate, I want to delete all of the duplicates but one, then update any foreign key references to the deleted rows so that they point to the row I kept.
For example, we see two instances of Star Lord below:
+-----------------------+
|        `users`        |
+------+----------------+
| id   | name           |
+------+----------------+
| 1    | Star Lord      |
+------+----------------+
| 2    | Star Lord      |
+------+----------------+
| 3    | Iron Man       |
+------+----------------+

+-----------------------+
|      `messages`       |
+------+-----+----------+
| from | to  | text     |
+------+-----+----------+
| 1    | 5   | hi       |
+------+-----+----------+
| 2    | 5   | how r u  |
+------+-----+----------+
| 5    | 2   | Good, u? |
+------+-----+----------+
Those two tables should become:
+-----------------------+
|        `users`        |
+------+----------------+
| id   | name           |
+------+----------------+
| 1    | Star Lord      |
+------+----------------+
| 3    | Iron Man       |
+------+----------------+

+-----------------------+
|      `messages`       |
+------+-----+----------+
| from | to  | text     |
+------+-----+----------+
| 1    | 5   | hi       |
+------+-----+----------+
| 1    | 5   | how r u  |
+------+-----+----------+
| 5    | 1   | Good, u? |
+------+-----+----------+
Can this be done? I'm happy to use PHP as needed.
I found the following, but it's only for finding foreign key usage, not replacing instances for specific key values: MySQL: How to I find all tables that have foreign keys that reference particular table.column AND have values for those foreign keys?
Bonus Points
There may be additional data which needs to be merged in the users table. For example, Star Lord with ID #1 might have a phone field filled in, but Star Lord with ID #2 has an email field.
Worst case: they both have the same field filled in, with conflicting data.
I suggest:
Create a table of correct data. A good starting point might be:
CREATE TABLE users_new LIKE users;

ALTER TABLE users_new ADD UNIQUE (name);

INSERT INTO users_new (id, name, phone, email)
  SELECT MIN(id), name, GROUP_CONCAT(phone), GROUP_CONCAT(email)
  FROM users
  GROUP BY name;
Note that, due to your "worst case" observation under "Bonus Points", you may well want to manually verify the contents of this table before archiving the underlying users data (I advise against permanent deletion, just in case).
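One way to narrow down the manual check is to list only the names whose duplicates disagree. The sketch below is not part of the answer: it uses Python's sqlite3 module as a runnable stand-in for MySQL, and the phone and email values, as well as the extra Iron Man row with id 4, are invented for illustration. The query itself is plain SQL and should carry over to MySQL, but treat that as an assumption to test on a copy of your data.

```python
import sqlite3


def find_conflicts(conn):
    """Return (name, phones, emails) for names whose duplicate rows
    hold more than one distinct non-NULL phone or email."""
    return conn.execute(
        """
        SELECT name,
               COUNT(DISTINCT phone) AS phones,
               COUNT(DISTINCT email) AS emails
        FROM users
        GROUP BY name
        HAVING COUNT(DISTINCT phone) > 1 OR COUNT(DISTINCT email) > 1
        ORDER BY name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phone TEXT, email TEXT)"
)
# Hypothetical sample rows: Star Lord's duplicates complement each other,
# Iron Man's duplicates carry two different phone numbers.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "Star Lord", "555-0100", None),
        (2, "Star Lord", None, "quill@example.com"),
        (3, "Iron Man", "555-0111", None),
        (4, "Iron Man", "555-0199", None),
    ],
)
conflicts = find_conflicts(conn)
print(conflicts)
```

COUNT(DISTINCT ...) ignores NULLs, so a name is reported only when two rows hold different filled-in values. Those are the rows where GROUP_CONCAT would produce a comma-separated mix that needs a human decision.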
Update existing foreign key references:
UPDATE messages
  JOIN (users uf JOIN users_new unf USING (name)) ON uf.id = messages.from
  JOIN (users ut JOIN users_new unt USING (name)) ON ut.id = messages.to
SET messages.from = unf.id,
    messages.to   = unt.id;
If you have a lot of tables to update, you could cache the results of the join between users and users_new, either:
in a new_id column within the old users table:
ALTER TABLE users ADD new_id BIGINT UNSIGNED;

UPDATE users
  JOIN users_new USING (name)
SET users.new_id = users_new.id;

UPDATE messages
  JOIN users uf ON uf.id = messages.from
  JOIN users ut ON ut.id = messages.to
SET messages.from = uf.new_id,
    messages.to   = ut.new_id;
or else in a new (temporary) table:
CREATE TEMPORARY TABLE newid_cache (
  PRIMARY KEY (old_id),
  KEY (old_id, new_id)
) ENGINE=MEMORY
  SELECT users.id AS old_id, users_new.id AS new_id
  FROM users
    JOIN users_new USING (name);

UPDATE messages
  JOIN newid_cache nf ON nf.old_id = messages.from
  JOIN newid_cache nt ON nt.old_id = messages.to
SET messages.from = nf.new_id,
    messages.to   = nt.new_id;
Either replace users with users_new, or else modify your application to use the new table in place of the old one.
ALTER TABLE users RENAME TO users_old;
ALTER TABLE users_new RENAME TO users;
Update any foreign key constraints as appropriate.
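The remapping idea can be tried end to end on the question's sample data. The following is only a sketch under stated assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, and because SQLite has no UPDATE ... JOIN, it swaps in correlated subqueries for the joins. The newid_cache name mirrors the temporary table used in the answer; everything else is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages ("from" INTEGER, "to" INTEGER, text TEXT);
    INSERT INTO users VALUES (1, 'Star Lord'), (2, 'Star Lord'), (3, 'Iron Man');
    INSERT INTO messages VALUES (1, 5, 'hi'), (2, 5, 'how r u'), (5, 2, 'Good, u?');

    -- Map every existing id to the lowest id that shares its name.
    CREATE TEMP TABLE newid_cache AS
    SELECT u.id AS old_id, k.new_id
    FROM users u
    JOIN (SELECT name, MIN(id) AS new_id FROM users GROUP BY name) k
      ON k.name = u.name;

    -- Remap each column on its own; ids without a mapping (5 here) stay as-is.
    UPDATE messages
    SET "from" = COALESCE(
            (SELECT new_id FROM newid_cache WHERE old_id = messages."from"), "from"),
        "to" = COALESCE(
            (SELECT new_id FROM newid_cache WHERE old_id = messages."to"), "to");

    -- Remove the duplicates that nothing points at any more.
    DELETE FROM users WHERE id NOT IN (SELECT new_id FROM newid_cache);
    """
)
users = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
messages = conn.execute(
    'SELECT "from", "to", text FROM messages ORDER BY rowid'
).fetchall()
print(users)
print(messages)
```

Handling each column independently matters for the sample data: user 5 does not appear in users, so a single statement that inner-joins on both from and to would skip every message involving 5 and leave the reference to id 2 in place. With enforced foreign keys every id would resolve and that case would not arise.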
I like to be really methodical about this. While you could write it all in one complex query, that's an optimisation and, unless the need for it is obvious, an unnecessary one.
First, back up your database :)
Create a table (keepers, with a keeper_id column) to hold the ids of the users you are going to keep.
Fill it with, say:
-- Keep the lowest id for each name
INSERT INTO keepers
SELECT MIN(id) FROM `users` GROUP BY `name`;
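After that it's just a couple of updates with joins, e.g.

-- Point each sender at the keeper that shares the sender's name
UPDATE `messages` m
  JOIN `users` u ON u.id = m.`from`
  JOIN `users` ku ON ku.`name` = u.`name`
  JOIN keepers k ON k.keeper_id = ku.id
SET m.`from` = k.keeper_id;

-- Same remap for the recipient column
UPDATE `messages` m
  JOIN `users` u ON u.id = m.`to`
  JOIN `users` ku ON ku.`name` = u.`name`
  JOIN keepers k ON k.keeper_id = ku.id
SET m.`to` = k.keeper_id;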
Then get rid of the users you don't want:

-- Delete every user whose id is not in keepers
DELETE u
FROM `users` u
  LEFT JOIN keepers k ON k.keeper_id = u.id
WHERE k.keeper_id IS NULL;
Once all is good (e.g. you have the same number of messages as you started with, no one is talking to themselves, etc.), delete the keepers table.
Syntax not checked, but it should be close.
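The "all is good" checks can be scripted so they are easy to rerun. This is a hedged sketch, not the answer's own code: it uses Python's sqlite3 module as a stand-in for MySQL, merge_report is a made-up helper name, and user 5 ("Groot") is invented so that every id in the sample messages resolves to a user.

```python
import sqlite3


def merge_report(conn, messages_before):
    """Summarise post-merge sanity checks; the two counts should be zero
    and count_unchanged should be True after a clean merge."""
    count_now = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    self_messages = conn.execute(
        'SELECT COUNT(*) FROM messages WHERE "from" = "to"'
    ).fetchone()[0]
    dangling = conn.execute(
        """
        SELECT COUNT(*) FROM messages m
        WHERE NOT EXISTS (SELECT 1 FROM users u WHERE u.id = m."from")
           OR NOT EXISTS (SELECT 1 FROM users u WHERE u.id = m."to")
        """
    ).fetchone()[0]
    return {
        "count_unchanged": count_now == messages_before,
        "self_messages": self_messages,
        "dangling_references": dangling,
    }


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages ("from" INTEGER, "to" INTEGER, text TEXT);
    -- State after the merge; user 5 is hypothetical sample data.
    INSERT INTO users VALUES (1, 'Star Lord'), (3, 'Iron Man'), (5, 'Groot');
    INSERT INTO messages VALUES (1, 5, 'hi'), (1, 5, 'how r u'), (5, 1, 'Good, u?');
    """
)
report = merge_report(conn, messages_before=3)
print(report)
```

A non-zero self_messages count is not automatically an error: if two duplicate accounts had messaged each other before the merge, those rows legitimately end up with the same sender and recipient and need a manual look.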