MySQL - Select un-matching data in a one to many relationship

First of all, this question concerns PHP and MySQL.

I have two database tables:

The People table:

person_id   |   field_1  |  field_2  |  other_fields... 

And the Notes table:

note_id   |  person_id  |  created_timestamp  |  other_fields... 

The People table has a one-to-many relationship with the Notes table.
Every time a note is created in the Notes table, a timestamp is attached to it and a person_id foreign key is assigned.

Now...
I need to find all people who haven't had a note recorded against them in the last 30 days.
The way I do it now is:

  1. Get all notes from the Notes table with a distinct person_id and a created_timestamp > 'time(31*86400)' (not precise, I know, but it suits my needs)
  2. Loop through the results and add each person_id to a temporary array $temp
  3. Get all records from the People table
  4. Loop through each record and do an in_array() comparison of the person_id with $temp

This isn't very efficient and cripples the application when there are a lot of People or Notes.

Does anyone have a better solution? Ideally something that can be achieved with a single SQL query.

Thanks for looking.

SELECT person_id FROM People WHERE person_id NOT IN 
    (SELECT person_id FROM Notes 
        WHERE created_timestamp > DATE_SUB(CURDATE(), INTERVAL 30 DAY))

This assumes that created_timestamp is of type DATE, TIMESTAMP, or DATETIME. If you store a Unix timestamp instead, convert it to a MySQL timestamp with FROM_UNIXTIME(created_timestamp).
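If created_timestamp does hold Unix epoch seconds, another way to handle it is to compute an integer cutoff and compare against that directly. The following is only a rough sketch under assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the sample rows and the helper name people_without_recent_notes are made up for illustration. On MySQL the same idea would compare against UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY).

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def people_without_recent_notes(conn, now, days=30):
    """Return person_ids that have no note newer than `days` days before `now`."""
    cutoff = now - days * SECONDS_PER_DAY  # integer epoch seconds
    rows = conn.execute(
        """
        SELECT person_id FROM People
        WHERE person_id NOT IN (
            SELECT person_id FROM Notes WHERE created_timestamp > ?
        )
        ORDER BY person_id
        """,
        (cutoff,),
    )
    return [row[0] for row in rows]


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (person_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE Notes (note_id INTEGER PRIMARY KEY, "
    "person_id INTEGER NOT NULL, created_timestamp INTEGER NOT NULL)"
)
now = int(time.time())
conn.executemany("INSERT INTO People VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO Notes (person_id, created_timestamp) VALUES (?, ?)",
    [
        (1, now - 5 * SECONDS_PER_DAY),   # person 1: recent note
        (2, now - 90 * SECONDS_PER_DAY),  # person 2: only an old note
    ],                                     # person 3: no notes at all
)

stale = people_without_recent_notes(conn, now)
print(stale)
```

Comparing the raw integer column against a precomputed cutoff also leaves the column unwrapped by any function, which usually makes it easier for an index on created_timestamp to be used.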

The standard solution is to use a sub-query of this form:

Select * from people where PersonID NOT in 
  (select PersonID from Notes where Created_Timestamp>...)

Another option is to do an outer join that keeps every People row (People LEFT JOIN Notes, or equivalently Notes RIGHT JOIN People), with the date condition in the join, and filter only for Notes.PersonID IS NULL, which gives you just the rows that have no match in Notes.

Personally I prefer the sub-query method above, which should run fairly efficiently and is easier to understand than the outer join solution.
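To see that the two forms pick out the same people, here is a small sketch. It makes several assumptions: Python's built-in sqlite3 stands in for MySQL, the column names follow the question (person_id, created_timestamp) rather than PersonID, and the rows and the fixed cutoff value are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Notes (
        note_id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES People(person_id),
        created_timestamp TEXT NOT NULL
    );
    INSERT INTO People VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Notes (person_id, created_timestamp) VALUES
        (1, '2024-03-10 09:00:00'),
        (1, '2024-01-02 09:00:00'),
        (2, '2024-01-15 09:00:00');
    """
)

# Fixed stand-in for "30 days ago" so the example is reproducible.
cutoff = "2024-02-15 00:00:00"

subquery_sql = """
    SELECT person_id FROM People
    WHERE person_id NOT IN (
        SELECT person_id FROM Notes WHERE created_timestamp > ?
    )
    ORDER BY person_id
"""

# The date condition lives in the ON clause; putting it in WHERE would
# discard the unmatched rows the IS NULL test is looking for.
outer_join_sql = """
    SELECT p.person_id
    FROM People p
    LEFT JOIN Notes n
           ON n.person_id = p.person_id
          AND n.created_timestamp > ?
    WHERE n.person_id IS NULL
    ORDER BY p.person_id
"""

via_subquery = [r[0] for r in conn.execute(subquery_sql, (cutoff,))]
via_outer_join = [r[0] for r in conn.execute(outer_join_sql, (cutoff,))]
print(via_subquery, via_outer_join)
```

Both lists contain Bob (only an old note) and Cy (no notes), while Ann is excluded because one of her notes is newer than the cutoff.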

I believe this should work (as written it returns people with no notes at all; add a date condition to the subquery to limit it to the last 30 days):

SELECT * FROM people WHERE person_id NOT IN (SELECT DISTINCT person_id FROM notes);

If this is critical, you could consider denormalizing: store the timestamp of the last note in the People table and index that column.
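A minimal sketch of that denormalization, under stated assumptions: Python's built-in sqlite3 stands in for MySQL, and the column name last_note_at, the index name, and the helper add_note are hypothetical names chosen for this example. The application updates the column in the same transaction that inserts the note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (
        person_id INTEGER PRIMARY KEY,
        last_note_at TEXT  -- hypothetical denormalized column; NULL means no notes yet
    );
    CREATE INDEX idx_people_last_note_at ON People (last_note_at);
    CREATE TABLE Notes (
        note_id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL,
        created_timestamp TEXT NOT NULL
    );
    INSERT INTO People (person_id) VALUES (1), (2), (3);
    """
)


def add_note(conn, person_id, created_timestamp):
    """Insert a note and keep People.last_note_at in step with it."""
    with conn:  # both statements commit together, or both roll back
        conn.execute(
            "INSERT INTO Notes (person_id, created_timestamp) VALUES (?, ?)",
            (person_id, created_timestamp),
        )
        # Only move last_note_at forward, never backwards.
        conn.execute(
            "UPDATE People SET last_note_at = ? "
            "WHERE person_id = ? AND (last_note_at IS NULL OR last_note_at < ?)",
            (created_timestamp, person_id, created_timestamp),
        )


add_note(conn, 1, "2024-03-10 09:00:00")
add_note(conn, 1, "2024-01-02 09:00:00")  # older note, must not overwrite
add_note(conn, 2, "2024-01-15 09:00:00")

cutoff = "2024-02-15 00:00:00"  # fixed stand-in for "30 days ago"
stale = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM People "
        "WHERE last_note_at IS NULL OR last_note_at <= ? "
        "ORDER BY person_id",
        (cutoff,),
    )
]
print(stale)
```

The lookup then touches only the People table. The trade-off is that every code path that writes or deletes notes has to keep the extra column correct.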

Otherwise, there's no way to avoid traversing the entire people table, so add an index on the (person_id, created_timestamp) pair of the notes table and use a left join or subquery:

-- Left join against notes from the last 30 days; keep people with no match
SELECT people.* FROM people 
         LEFT JOIN notes ON people.person_id = notes.person_id
                        AND notes.created_timestamp > NOW() - INTERVAL 30 DAY
WHERE notes.person_id IS NULL

-- Equivalent subquery form
SELECT * FROM people
WHERE person_id NOT IN (SELECT person_id FROM notes
                        WHERE created_timestamp > NOW() - INTERVAL 30 DAY)

LEFT JOIN/IS NULL

   SELECT p.*
     FROM PEOPLE p
LEFT JOIN NOTES n ON n.person_id = p.person_id
                 AND n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY)
    WHERE n.note_id IS NULL

NOT EXISTS

SELECT p.*
  FROM PEOPLE p
 WHERE NOT EXISTS(SELECT NULL
                    FROM NOTES n
                   WHERE n.person_id = p.person_id
                     AND n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY))

NOT IN

SELECT p.*
  FROM PEOPLE p
 WHERE p.person_id NOT IN (SELECT n.person_id
                             FROM NOTES n
                            WHERE n.created_timestamp >= DATE_SUB(NOW(), INTERVAL 30 DAY))

Conclusion

LEFT JOIN/IS NULL is the most efficient on MySQL when the columns compared are not nullable. If the columns compared are nullable, NOT IN and NOT EXISTS are more efficient.
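Nullability matters for results as well as speed: if the NOT IN subquery can return a NULL person_id, the NOT IN form matches no rows at all, while the other two forms are unaffected. The sketch below illustrates this under assumptions: Python's built-in sqlite3 stands in for MySQL (both follow standard SQL three-valued logic here), Notes.person_id is deliberately declared nullable, and the rows and cutoff are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE People (person_id INTEGER PRIMARY KEY);
    CREATE TABLE Notes (
        note_id INTEGER PRIMARY KEY,
        person_id INTEGER,  -- nullable on purpose for this demonstration
        created_timestamp TEXT NOT NULL
    );
    INSERT INTO People VALUES (1), (2), (3);
    INSERT INTO Notes (person_id, created_timestamp) VALUES
        (1, '2024-03-10 09:00:00'),
        (NULL, '2024-03-11 09:00:00');
    """
)

cutoff = "2024-02-15 00:00:00"  # fixed stand-in for "30 days ago"

not_in_sql = """
    SELECT p.person_id FROM People p
    WHERE p.person_id NOT IN (
        SELECT n.person_id FROM Notes n WHERE n.created_timestamp >= ?
    )
    ORDER BY p.person_id
"""
not_exists_sql = """
    SELECT p.person_id FROM People p
    WHERE NOT EXISTS (
        SELECT NULL FROM Notes n
        WHERE n.person_id = p.person_id AND n.created_timestamp >= ?
    )
    ORDER BY p.person_id
"""
left_join_sql = """
    SELECT p.person_id FROM People p
    LEFT JOIN Notes n
           ON n.person_id = p.person_id AND n.created_timestamp >= ?
    WHERE n.note_id IS NULL
    ORDER BY p.person_id
"""

not_in = [r[0] for r in conn.execute(not_in_sql, (cutoff,))]
not_exists = [r[0] for r in conn.execute(not_exists_sql, (cutoff,))]
left_join = [r[0] for r in conn.execute(left_join_sql, (cutoff,))]
print(not_in, not_exists, left_join)
```

With the NULL row present, the NOT IN query returns an empty list because `x NOT IN (1, NULL)` evaluates to NULL rather than true. If the column can be NULL, adding `AND n.person_id IS NOT NULL` to the subquery, or using NOT EXISTS, avoids the surprise.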
