Best way to check for duplicate records

Question

I have two tables, A and B, with a one-to-many relationship from A to B.

A has 5 columns:

a1, a2, a3, a4, a5

and B has 5 columns:

b1, b2, b3, b4, a1.

Note that a1 is a foreign key in table B.

I have a requirement to check for duplicate records in the table, i.e. no two records should have exactly the same values for all attributes.

The most efficient way I can think of to determine uniqueness is to compute a checksum-like value and store it in every row of table A. But this requires extra space, and I would have to make sure that the checksum really is unique.

Is this the best way to go, or is there some other way I am unaware of?

For example, let's say table A is a Rules table and table B is a Trigger table. The Rules table holds rules created by different users, which means it has a mapping to a Users table. What I actually want is that a user should not be able to create identical rules. So when a user saves a rule, I run a query to check whether a record with an identical checksum already exists for that particular user. If it does, I return an appropriate error; otherwise I let the user create the record. I hope this makes clear why I can't simply put a unique constraint across all records.

Answer 1

Do a SELECT with a GROUP BY clause. For example:

SELECT a1, a2, a3, a4, a5, COUNT(*)
FROM #TempPersons  -- placeholder temp table; substitute your own table
GROUP BY a1, a2, a3, a4, a5
HAVING COUNT(*) > 1;

This returns each combination of a1, a2, a3, a4, a5 that occurs more than once, together with a count of how many times it appears.
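To illustrate the query above, here is a minimal runnable sketch using Python's built-in sqlite3 module with an in-memory database. The table name A follows the question; the sample rows are made up, the table deliberately has no primary key so that an exact duplicate can exist, and SQLite only stands in for whatever database engine is actually in use.

```python
import sqlite3

# In-memory database; table A and its rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a1 INTEGER, a2 TEXT, a3 TEXT, a4 TEXT, a5 TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?, ?, ?)",
    [
        (1, "x", "y", "z", "w"),
        (1, "x", "y", "z", "w"),  # exact duplicate of the row above
        (2, "x", "y", "z", "q"),  # differs in a1 and a5, so not a duplicate
    ],
)

# Group on every column; any group with more than one row is a duplicate.
duplicates = conn.execute(
    """
    SELECT a1, a2, a3, a4, a5, COUNT(*) AS occurrences
    FROM A
    GROUP BY a1, a2, a3, a4, a5
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicates)  # [(1, 'x', 'y', 'z', 'w', 2)]
```

If a1 is an auto-generated key, it differs on every row and would have to be left out of the GROUP BY list, otherwise no group can ever contain more than one row.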

Answer 2

Having a UNIQUE constraint on those columns seems like the way to go.
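As a rough sketch of the constraint approach, scoped per user as the question requires: the example below assumes a hypothetical `rules` table with `user_id`, `rule_condition` and `rule_action` columns (none of these names come from the question) and a hypothetical helper `save_rule`. It uses Python's built-in sqlite3 module so that it can be run as-is; the same UNIQUE clause should carry over to other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: uniqueness is enforced per user, not across all rows.
conn.execute(
    """
    CREATE TABLE rules (
        rule_id        INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL,
        rule_condition TEXT NOT NULL,
        rule_action    TEXT NOT NULL,
        UNIQUE (user_id, rule_condition, rule_action)
    )
    """
)

def save_rule(conn, user_id, rule_condition, rule_action):
    """Insert a rule; return False if this user already has an identical one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO rules (user_id, rule_condition, rule_action) "
                "VALUES (?, ?, ?)",
                (user_id, rule_condition, rule_action),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the duplicate.
        return False

first = save_rule(conn, 1, "temp > 30", "send alert")
second = save_rule(conn, 1, "temp > 30", "send alert")      # same user, same rule
other_user = save_rule(conn, 2, "temp > 30", "send alert")  # different user

print(first, second, other_user)  # True False True
```

Including the user column in the constraint lets two different users hold the same rule while blocking a repeat for one user. A constraint like this only sees the rule's own columns; it does not compare the trigger rows in the child table.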

However, just to address your other remarks: I have used extra columns to detect changes before. Back then I did something similar to this:

CONVERT([NVARCHAR](42), HASHBYTES('SHA1', CONCAT(Column1, '||', Column2, ...)), (1))

I found it to be a rather nice way to combine many columns into a single hash that depends on their contents without growing out of proportion. (I used this in a data warehousing environment to check large tables for record-level changes based on a business key. I stored it as a PERSISTED column so that an index could be built on it too.)
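The same idea can be sketched outside SQL Server. The following minimal Python example assumes a hypothetical helper named `row_fingerprint` and made-up column values; it joins the values with the `||` delimiter from the expression above and hashes them with SHA-1 through the standard hashlib module. It does not reproduce the exact output of the T-SQL expression, since string encoding and formatting differ.

```python
import hashlib

def row_fingerprint(*columns):
    """Return a SHA-1 hex digest over the column values joined by '||'."""
    # None becomes an empty string, similar to how T-SQL CONCAT treats NULL.
    joined = "||".join("" if value is None else str(value) for value in columns)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

same_a = row_fingerprint("temp > 30", "send alert", 5)
same_b = row_fingerprint("temp > 30", "send alert", 5)
different = row_fingerprint("temp > 30", "send alert", 6)

print(same_a == same_b)     # True: identical values give identical fingerprints
print(same_a == different)  # False: a changed value gives a different fingerprint
print(len(same_a))          # 40: length of a SHA-1 hex digest
```

The delimiter keeps ("ab", "c") and ("a", "bc") from hashing to the same value, although values that themselves contain `||` can still produce the same joined string. A matching hash is best treated as a candidate duplicate and confirmed by comparing the actual columns.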

Answer 1: 1 2016-09-22 12:56:47
Answer 2: 1 2016-09-22 13:04:03
