Ignore cascade on foreign key update?

To preface, I'm not very experienced with database design. I have a table of hashes and ids. When a group of new hashes is added, each row in the group gets the same id. If any hash in the new group already exists in the database, all hashes in the new group and in the existing group(s) get a new, shared id (effectively merging ids when hashes are repeated):

INSERT INTO hashes 
    (id, hash) 
VALUES 
    ($new_id, ...), ($new_id, ...)
ON DUPLICATE KEY UPDATE 
    repeat_count = repeat_count + 1;

INSERT INTO hashes_lookup SELECT DISTINCT id FROM hashes WHERE hash IN (...);
UPDATE hashes JOIN hashes_lookup USING (id) SET id = '$new_id';
TRUNCATE TABLE hashes_lookup;

Other tables reference these ids, so if an id changes, foreign key constraints take care of updating it across tables. The issue, however, is that I can't enforce uniqueness in any of the child tables. If I do, my queries fail with:

Foreign key constraint for table '...', record '...' would lead to a duplicate entry in table '...'

This error makes sense, given the following test case where id and value form a composite unique key:

id | value
---+-------
a  | 1
b  | 2
c  | 1

Then a gets changed to c:

id | value
---+-------
c  | 1
b  | 2
c  | 1

But c,1 already exists.

It would be ideal if there were an ON UPDATE IGNORE CASCADE option, so that if a duplicate row already exists, any duplicating rows are ignored. However, I'm pretty sure the real issue here is my database design, so I am open to any and all suggestions. My current solution is to not enforce uniqueness in the child tables, which leads to a lot of redundant rows.
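
MySQL has no such cascade option, but the same effect can be sketched by hand on the test case above, without any foreign key involved. The table name child_demo and the extra row (a, 3) are made up for illustration. UPDATE IGNORE skips the rows that would collide with an existing unique key, and a follow-up DELETE removes the rows it skipped:

```sql
-- Hypothetical stand-in for a child table with a composite unique key.
CREATE TABLE child_demo (
  id      CHAR(1) NOT NULL,
  `value` INT     NOT NULL,
  UNIQUE KEY id_value (id, `value`)
) ENGINE=InnoDB;

INSERT INTO child_demo (id, `value`) VALUES ('a', 1), ('a', 3), ('b', 2), ('c', 1);

-- Moves (a,3) to (c,3); (a,1) would duplicate (c,1), so it is left untouched.
UPDATE IGNORE child_demo SET id = 'c' WHERE id = 'a';

-- Anything still carrying the old id was a duplicate and can be dropped.
DELETE FROM child_demo WHERE id = 'a';

SELECT id, `value` FROM child_demo ORDER BY id, `value`;
-- b | 2
-- c | 1
-- c | 3
```

This only shows the row-level behaviour; it would still have to be combined with whatever mechanism changes the parent id.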

Edit:

CREATE TABLE `hashes` (
 `hash` char(64) NOT NULL,
 `id` varchar(128) NOT NULL,
 `repeat_count` int(11) NOT NULL DEFAULT '0',
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `hash` (`hash`) USING BTREE,
 KEY `id` (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `emails` (
 `id` varchar(128) NOT NULL,
 `group_id` char(5) NOT NULL,
 `email` varchar(500) NOT NULL,
 KEY `index` (`id`) USING BTREE,
 UNIQUE KEY `id` (`id`,`group_id`,`email`(255)) USING BTREE,
 CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`id`) REFERENCES `hashes` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I think it would be good to create a hash_group table to store the id of each hash group:

CREATE TABLE `hash_group` (
 `id` BIGINT AUTO_INCREMENT NOT NULL,
 `group_name` varchar(128) NOT NULL,
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `group_name` (`group_name`) USING BTREE,
 PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And change the structure of the existing tables:

CREATE TABLE `hashes` (
 `hash` char(64) NOT NULL,
 `hash_group_id` BIGINT NOT NULL,
 `repeat_count` int(11) NOT NULL DEFAULT '0',
 `insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
 `update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 UNIQUE KEY `hash` (`hash`) USING BTREE,
 KEY `hashes_hash_group_id_index` (`hash_group_id`) USING BTREE,
 CONSTRAINT `hashes_hash_group_id_fk` FOREIGN KEY (`hash_group_id`) REFERENCES `hash_group` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `emails` (
 `hash_group_id` BIGINT NOT NULL,
 `group_id` char(5) NOT NULL,
 `email` varchar(500) NOT NULL,
 KEY `emails_hash_group_id_index` (`hash_group_id`) USING BTREE,
 UNIQUE KEY `emails_unique` (`hash_group_id`,`group_id`,`email`(255)) USING BTREE,
 CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`hash_group_id`) REFERENCES `hash_group` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Also create a trigger to update the hash group, if you need one:

DELIMITER $$
CREATE TRIGGER `update_hash_group_name` AFTER UPDATE ON `hashes`
FOR EACH ROW
BEGIN
    UPDATE `hash_group` 
    SET `group_name` = md5(now()) -- replace with your hash formula
    WHERE id = NEW.hash_group_id;
END;$$
DELIMITER ;

And create a function for getting the actual group id:

DROP FUNCTION IF EXISTS get_hash_group;

DELIMITER $$
CREATE FUNCTION get_hash_group(id BIGINT) RETURNS BIGINT -- BIGINT to match hash_group.id
BEGIN
  IF (id IS NULL) THEN
    INSERT INTO `hash_group` (`group_name`) 
    VALUES (md5(now())); -- replace with your hash formula
    RETURN LAST_INSERT_ID();
  END IF;

  RETURN id;
END;$$
DELIMITER ;

Scenario:

Initial fill:

INSERT INTO `hash_group` (id, group_name) VALUES 
(1, 'test1'),
(2, 'test2'),
(3, 'test3');

INSERT INTO `hashes` (hash, hash_group_id) VALUES
('hash11', 1),
('hash12', 1),
('hash13', 1),
('hash2', 2),
('hash3', 3);

INSERT INTO `emails` (hash_group_id, group_id, email)
VALUES
(1, 'g1', 'example1@'),
(2, 'g1', 'example2@'),
(3, 'g1', 'example2@');

Scenario for updating hash_group:

START TRANSACTION;

-- Get @min_group_id, the minimum group id (we keep this id and delete the others)

SELECT MIN(hash_group_id) INTO @min_group_id
FROM hashes 
WHERE hash IN ('hash11', 'hash12', 'hash2', 'hash15');

-- Replace the other group ids in the emails table with @min_group_id

UPDATE `emails` 
SET `hash_group_id` = @min_group_id
WHERE `hash_group_id` IN (
  SELECT hash_group_id
  FROM hashes 
  WHERE @min_group_id IS NOT NULL
  AND hash IN ('hash11', 'hash12', 'hash2', 'hash15')
  -- Update only if we are merging several hash_groups
  AND `hash_group_id` > @min_group_id
);

-- Delete the other hash_groups and keep only the group with @min_group_id

DELETE FROM `hash_group` WHERE `id` IN (
  SELECT hash_group_id
  FROM hashes 
  WHERE @min_group_id IS NOT NULL
  AND hash IN ('hash11', 'hash12', 'hash2', 'hash15')
  -- Delete only if we are merging several hash_groups
  AND `hash_group_id` > @min_group_id
);

-- @group_id = the existing hash_group.id, or a newly created one if @min_group_id is NULL (all inserted hashes are new)

SELECT get_hash_group(@min_group_id) INTO @group_id;

-- Now we can insert new hashes.

INSERT INTO `hashes` (hash, hash_group_id) VALUES
('hash11', @group_id),
('hash12', @group_id),
('hash2', @group_id),
('hash15', @group_id)
ON DUPLICATE KEY 
UPDATE repeat_count = repeat_count + 1;


COMMIT;
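
One thing to watch in the scenario above: hashes references hash_group with ON DELETE CASCADE, so deleting a merged group also deletes every hashes row still pointing at it. Hashes that belong to a merged group but are not in the new batch are lost, and re-inserted ones restart at repeat_count 0. If that is not wanted, a possible variant of the middle steps is sketched below. It assumes a helper temporary table named merged_groups, which is not part of the original answer, and it uses UPDATE IGNORE so that email rows that would violate emails_unique are skipped and then removed by the cascading delete:

```sql
-- Run inside the transaction, right after @min_group_id has been computed.
-- merged_groups is a hypothetical helper table holding the groups to absorb.
CREATE TEMPORARY TABLE merged_groups AS
SELECT DISTINCT hash_group_id
FROM hashes
WHERE hash IN ('hash11', 'hash12', 'hash2', 'hash15')
  AND hash_group_id > @min_group_id;   -- empty when @min_group_id IS NULL

-- Re-point emails; rows that would become duplicates are skipped.
UPDATE IGNORE emails
JOIN merged_groups USING (hash_group_id)
SET emails.hash_group_id = @min_group_id;

-- Re-point all hashes of the absorbed groups, so none are lost below.
UPDATE hashes
JOIN merged_groups USING (hash_group_id)
SET hashes.hash_group_id = @min_group_id;

-- Remove the absorbed groups; the cascade now only deletes the
-- duplicate email rows that UPDATE IGNORE left behind.
DELETE hash_group
FROM hash_group
JOIN merged_groups ON merged_groups.hash_group_id = hash_group.id;

DROP TEMPORARY TABLE merged_groups;
```

The remaining steps (get_hash_group and the final INSERT ... ON DUPLICATE KEY UPDATE) would stay as they are. Some replication setups restrict creating temporary tables inside a transaction, so treat this as a starting point rather than a drop-in replacement.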

I may be wrong, but I think you misnamed the id field in hashes.

I think you should rename the id field in hashes to something like group_id, then add an AUTO_INCREMENT field called id that is also the PRIMARY KEY of hashes, and have the id in emails refer to this field instead. When you want to update and relate all the hashes together, you update the group_id field instead of id, and id remains unique across the table.

This way you avoid the cascade problem, and you will always know the original hash that the email was referring to. Sure, if you want to fetch all the hashes related to an email (old and new) you must execute an extra query, but I think it solves all your problems.
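
As a rough sketch of what that layout could look like (column types are carried over from the question where possible; the constraint names, the sample ids 'a', 'c', 'n' and the sample email are assumptions made for illustration):

```sql
CREATE TABLE `hashes` (
 `id` BIGINT NOT NULL AUTO_INCREMENT,      -- stable row id, never updated
 `group_id` varchar(128) NOT NULL,         -- the value that changes on a merge
 `hash` char(64) NOT NULL,
 `repeat_count` int NOT NULL DEFAULT 0,
 PRIMARY KEY (`id`),
 UNIQUE KEY `hash` (`hash`),
 KEY `group_id` (`group_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `emails` (
 `id` BIGINT NOT NULL,                     -- refers to hashes.id
 `group_id` char(5) NOT NULL,
 `email` varchar(500) NOT NULL,
 UNIQUE KEY `emails_unique` (`id`,`group_id`,`email`(255)),
 CONSTRAINT `emails_hash_fk` FOREIGN KEY (`id`) REFERENCES `hashes` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Merging groups 'a' and 'c' into 'n' touches only hashes.group_id;
-- no child row changes, so no duplicate can arise in emails.
UPDATE `hashes` SET `group_id` = 'n' WHERE `group_id` IN ('a', 'c');

-- The extra query: all hashes in the same group as the hash an email refers to.
SELECT DISTINCT h2.`hash`
FROM `emails` e
JOIN `hashes` h  ON h.`id` = e.`id`
JOIN `hashes` h2 ON h2.`group_id` = h.`group_id`
WHERE e.`email` = 'someone@example.com';
```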

Edit:
You can use a trigger to do this.

The trigger goes like this:

DELIMITER $$
CREATE TRIGGER `update_hash_id` AFTER UPDATE ON `hashes`
FOR EACH ROW
BEGIN
    UPDATE `emails` SET `id` = NEW.id WHERE `id` = OLD.id;
END;$$
DELIMITER ;

And you must remove the foreign key relation too.
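
If emails keeps its unique key, a plain UPDATE inside the trigger can still fail with a duplicate-key error when two ids are merged. A possible variant, assuming the foreign key has been dropped and using the made-up trigger name merge_hash_id in place of the one above, moves what it can and deletes the rest:

```sql
DELIMITER $$
CREATE TRIGGER `merge_hash_id` AFTER UPDATE ON `hashes`
FOR EACH ROW
BEGIN
    IF NEW.id <> OLD.id THEN
        -- Move the child rows that do not collide with an existing row.
        UPDATE IGNORE `emails` SET `id` = NEW.id WHERE `id` = OLD.id;
        -- Rows still carrying the old id were duplicates of existing rows.
        DELETE FROM `emails` WHERE `id` = OLD.id;
    END IF;
END$$
DELIMITER ;
```

The trigger fires once per updated hashes row; after the first row of a group has been processed, the later firings for the same old id find nothing left to move.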

Adding an extra integer column to each of the child tables and using it as the primary key would avoid this problem altogether. The key never changes because it isn't a reference to anything else.

Using composite keys as primary keys is generally something you want to avoid. And considering that this key combination is not always unique, I would definitely say you need a dedicated primary key in every child table with this problem.

You can even auto-increment it so you aren't manually assigning it every time. For example:

CREATE TABLE exampleTable
(
    trueID int NOT NULL AUTO_INCREMENT,
    col1 int NOT NULL,
    col2 varchar(50),
    PRIMARY KEY (trueID)
);

Then, when two rows in a child table end up with identical values (for whatever reason), the primary key stays unique, preventing any conflicts in the database that could arise.
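
A short usage sketch against exampleTable (the inserted values are arbitrary):

```sql
-- Two rows with identical col1/col2 are both accepted,
-- because only trueID has to be unique.
INSERT INTO exampleTable (col1, col2) VALUES (1, 'x'), (1, 'x');

-- Returns two rows with different trueID values and the same col1, col2.
SELECT trueID, col1, col2 FROM exampleTable WHERE col1 = 1 AND col2 = 'x';
```

Note that this avoids the error by allowing the duplicate rows, so the redundancy mentioned in the question remains unless a separate unique key is added.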

The solution we arrived at in chat:

/* Tables */

/* entities is created first because emails and hashes reference it */
CREATE TABLE `entities` (
 `group_id` bigint(20) NOT NULL,
 `entity_id` bigint(20) NOT NULL,
 PRIMARY KEY (`group_id`),
 KEY `entity_id` (`entity_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `hashes` (
 `group_id` bigint(20) NOT NULL,
 `hash` varchar(128) NOT NULL,
 `repeat_count` int(11) NOT NULL DEFAULT '0',
 UNIQUE KEY `hash` (`hash`),
 KEY `group_id` (`group_id`),
 CONSTRAINT `hashes_ibfk_1` FOREIGN KEY (`group_id`) REFERENCES `entities` (`group_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `emails` (
 `group_id` bigint(20) NOT NULL,
 `email` varchar(500) NOT NULL,
 UNIQUE KEY `group_id` (`group_id`,`email`) USING BTREE,
 CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`group_id`) REFERENCES `entities` (`group_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `entity_lookup` (
 `entity_id` bigint(20) NOT NULL,
 PRIMARY KEY (`entity_id`) USING HASH
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

/* Inserting */

START TRANSACTION;

/* Determine next group ID (COALESCE covers the empty-table case, where MAX() is NULL) */
SET @next_group_id = (SELECT COALESCE(MAX(group_id), 0) + 1 FROM entities);

/* Determine next entity ID */
SET @next_entity_id = (SELECT COALESCE(MAX(entity_id), 0) + 1 FROM entities);

/* Merge any entity ids */
INSERT IGNORE INTO entity_lookup SELECT entity_id FROM entities JOIN hashes USING(group_id) WHERE HASH IN(...);
UPDATE entities JOIN entity_lookup USING(entity_id) SET entity_id = @next_entity_id;
/* DELETE instead of TRUNCATE TABLE: TRUNCATE would implicitly commit the open transaction */
DELETE FROM entity_lookup;

/* Add the new group ID to entity_id */
INSERT INTO entities(group_id, entity_id) VALUES(@next_group_id, @next_entity_id);

/* Add new values into hashes */
INSERT INTO hashes (group_id, HASH) VALUES 
    (@next_group_id, ...)
ON DUPLICATE KEY UPDATE
  repeat_count = repeat_count + 1;

/* Add other new values */
INSERT IGNORE INTO emails (group_id, email) VALUES
    (@next_group_id, "email1");

COMMIT;
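
With this layout a group_id never changes after it is written; only entities.entity_id is rewritten when groups merge, so no child row is ever updated and the duplicate-key problem cannot occur. The price is paid at read time: the same email may be stored under several groups of one entity. A minimal read-side sketch, with 'some-hash' as a placeholder value:

```sql
-- All distinct emails belonging to the entity that owns a given hash.
SELECT DISTINCT em.email
FROM hashes h
JOIN entities e  ON e.group_id   = h.group_id     -- group of the hash
JOIN entities e2 ON e2.entity_id = e.entity_id    -- every group of that entity
JOIN emails em   ON em.group_id  = e2.group_id
WHERE h.`hash` = 'some-hash';
```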
