MySQL - Best approach to ensure unique values across multiple rows

I have 3 tables:

Molecule:
  id

Atom:
  id

MoleculeAtom: # Composite primary key
  molecule_id
  atom_id

My goal is to ensure that no combination of atoms that makes up a molecule is repeated. For example, for the water molecule, I would store two rows in the MoleculeAtom table: one row for a hydrogen atom and one row for an oxygen atom. As you can see, I need to ensure that no other molecule has JUST hydrogen and oxygen, even though there may be other molecules which include hydrogen and oxygen.

At this point I have a query which identifies the molecules that include either hydrogen or oxygen and have only 2 atoms in the MoleculeAtom table.

SELECT
  m.id, m.name, (SELECT count(*) from molecule_atom where molecule_id = m.id group by molecule_id) as atomCount
FROM
  molecule AS m
INNER JOIN
  molecule_atom AS ma ON ma.molecule_id = m.id
WHERE
  ma.atom_id IN (1,2)
HAVING atomCount = 2;

Which returns (demonstrative snippet):

+----+----------------------------+-----------+
| id | name                       | atomCount |
+----+----------------------------+-----------+
| 53 | Carbon Dioxide             |         2 |
| 56 | Carbon Monoxide            |         2 |
+----+----------------------------+-----------+

(I know that both CO and CO2 have exactly the same atoms, in differing quantities, but disregard that, as I am tracking the quantities in another column in the same table.)

As of now I am pulling the above results and checking their atom_ids via PHP, which means I have to issue a separate query for each molecule. That seems inefficient, so I was looking to see if it's possible to do this check strictly in SQL.

Excuse any chemistry-related mistakes; it's been a long time since chem101.

What you are asking for is a table-level constraint (one that spans multiple rows), and these are not available in MySQL. In the SQL-92 standard, there is ASSERTION, which is actually even more general (a constraint across more than 1 table). See the answers to this question: Why don't DBMS's support ASSERTION, for details and for info about some products (MS-Access) that have such functionality with limitations.

In MySQL, you could try using a trigger to imitate such a constraint.


Update:

Firebird documentation says it allows subqueries in CHECK constraints.

As ypercube mentioned, MySQL doesn't support assertions, so I ended up writing a query to find all molecules that have at least one of the atoms belonging to the new molecule I am trying to create, and that have the same number of atoms. After querying for matches, the application steps through each molecule and determines whether it has exactly the same atoms as the new molecule. The query looks like this (it assumes I am trying to create a new molecule with 2 atoms):

SELECT
    m.id,
    m.name,
    -- Comma-separated atom ids; NULL unless the molecule has exactly 2 atoms
    (SELECT GROUP_CONCAT(ma.atom_id)
       FROM molecule_atom AS ma
      WHERE ma.molecule_id = m.id
      GROUP BY ma.molecule_id
     HAVING COUNT(ma.atom_id) = 2) AS atoms
FROM
    molecule AS m
INNER JOIN
    molecule_atom AS mas ON mas.molecule_id = m.id
WHERE
    mas.atom_id IN (1, 2);

Then in code (PHP) I do:

foreach ($molecules as $molecule) {

    if (isset($molecule['atoms'])) {

        $diff = array_diff($newAtomIds, explode(',', $molecule['atoms']));

        // If there is no diff, then we have a match
        if (count($diff) === 0) {
            return $molecule['name'];
        }
    }
}

Thanks for everyone's responses.
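
For what it's worth, the exact-match check done in the PHP loop above can probably be pushed entirely into SQL. A minimal sketch, assuming the molecule_atom table from the question and a new molecule made of atoms 1 and 2 (the ids and the count of 2 are placeholders that the application would fill in):

```sql
-- Molecules whose atom set is exactly {1, 2}.
-- Relies on (molecule_id, atom_id) being unique, which the composite
-- primary key on molecule_atom guarantees.
SELECT ma.molecule_id
FROM molecule_atom AS ma
GROUP BY ma.molecule_id
HAVING COUNT(*) = 2                      -- same number of atoms as the new molecule
   AND SUM(ma.atom_id IN (1, 2)) = 2;    -- and every one of them is in the new set
```

If this returns no rows, no existing molecule has that exact combination. This is still only a check, not a constraint: two concurrent inserts could both pass it unless they run in a transaction with suitable locking. If quantities matter (CO vs. CO2), the comparison would also have to include the quantity column, whose name is not given in the question.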

A unique index might be helpful on the molecule_atom table. That would prevent duplicates at that level. You're still going to need to do some checks via SQL statements. Another option, depending on the size of your list, would be to load it into an in-memory hash table and run the checks from there.

The idea here is to find pairs of molecules whose lists of atoms are not the same:

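-- Pairs of molecules (m1id < m2id) whose atom sets differ in either direction
SELECT m1.id AS m1id, m2.id AS m2id
FROM molecule AS m1
JOIN molecule AS m2 ON m1.id < m2.id
WHERE EXISTS (SELECT 1 FROM molecule_atom AS a          -- an atom of m1 missing from m2
              WHERE a.molecule_id = m1.id AND a.atom_id NOT IN
                    (SELECT b.atom_id FROM molecule_atom AS b WHERE b.molecule_id = m2.id))
   OR EXISTS (SELECT 1 FROM molecule_atom AS a          -- an atom of m2 missing from m1
              WHERE a.molecule_id = m2.id AND a.atom_id NOT IN
                    (SELECT b.atom_id FROM molecule_atom AS b WHERE b.molecule_id = m1.id));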
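
For auditing existing data, the opposite question (which molecules share exactly the same atoms?) may be easier to answer. A rough sketch, assuming MySQL's GROUP_CONCAT and the molecule_atom table from the question:

```sql
-- Build a canonical, sorted atom list per molecule, then group molecules by it.
SELECT s.atoms,
       GROUP_CONCAT(s.molecule_id ORDER BY s.molecule_id) AS molecule_ids
FROM (
    SELECT molecule_id,
           GROUP_CONCAT(atom_id ORDER BY atom_id SEPARATOR ',') AS atoms
    FROM molecule_atom
    GROUP BY molecule_id
) AS s
GROUP BY s.atoms
HAVING COUNT(*) > 1;   -- keep only atom sets used by more than one molecule
```

Each returned row is one duplicated atom set together with the ids of the molecules that use it. GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default), so very large molecules would need that limit raised.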
