
Select rows not in another table

SELECT DISTINCT a.value
FROM a LEFT JOIN b 
    ON a.value = b.value 
      AND (b.field IS NULL OR b.field != 'my_string');

SELECT a.value
FROM a
WHERE a.value NOT IN
    (SELECT value 
     FROM b
     WHERE b.field = 'my_string');

From what I've read, doing a LEFT JOIN is faster. But I've also read that DISTINCT is a code smell for an inefficient query. How do I go about determining which query performs better in a worst-case scenario?

EDIT: sorry, id is not a primary key, it's just another field. I'll replace it with value.

EDIT2: Looks like everybody's hung up on my first query. Let's say it looks like this instead. Isn't the logic the same?

SELECT DISTINCT a.value
FROM a LEFT JOIN b ON a.value = b.value 
WHERE (b.field IS NULL OR b.field != 'my_string');

EDIT3: Sample fiddle. http://sqlfiddle.com/#!2/500ea/1

EDIT4: Accepted answer. http://sqlfiddle.com/#!2/500ea/8

Your first join is nonsensical. It returns all a.id values in a. Remember, a left join keeps all rows in the first table and matching rows in the second. I think you intend:

SELECT a.id
FROM a LEFT JOIN
     b 
     ON a.id = b.id AND b.field = 'my_string'
WHERE b.field IS NULL;

The distinct should be unnecessary, assuming that a.id is, well, a unique id.

An alternative is to use not exists:
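To see the difference on concrete rows, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The sample rows are made up for illustration: id 1 has both a 'my_string' row and an 'other' row in b, id 2 has only an 'other' row, and id 3 has no row in b at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, field TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    -- hypothetical sample data, not taken from the fiddle
    INSERT INTO b VALUES (1, 'my_string'), (1, 'other'), (2, 'other');
""")

def ids(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Filter placed only in the ON clause: a LEFT JOIN still keeps every row of a.
on_only = ids("""
    SELECT DISTINCT a.id
    FROM a LEFT JOIN b
        ON a.id = b.id AND (b.field IS NULL OR b.field != 'my_string')
""")

# EDIT2 form: id 1 survives because of its 'other' row in b.
where_form = ids("""
    SELECT DISTINCT a.id
    FROM a LEFT JOIN b ON a.id = b.id
    WHERE (b.field IS NULL OR b.field != 'my_string')
""")

# NOT IN form from the question.
not_in = ids("""
    SELECT a.id FROM a
    WHERE a.id NOT IN (SELECT id FROM b WHERE b.field = 'my_string')
""")

# Anti-join: match only 'my_string' rows, then keep the rows of a with no match.
anti_join = ids("""
    SELECT a.id
    FROM a LEFT JOIN b ON a.id = b.id AND b.field = 'my_string'
    WHERE b.field IS NULL
""")

print(on_only, where_form, not_in, anti_join)
```

With this data both LEFT JOIN variants from the question return every id, while NOT IN and the anti-join agree on the ids that have no 'my_string' row. So the logic is only the same if b never holds more than one row per id.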

SELECT a.id
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.id AND b.field = 'my_string');

For performance, create an index on b(id, field).
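One way to check whether such an index actually gets used is to look at the query plan. The sketch below does this in SQLite through Python's sqlite3 module, where the statement is EXPLAIN QUERY PLAN; MySQL has its own EXPLAIN with different output, so treat this as an illustration of the workflow only. The index name idx_b_id_field is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, field TEXT);
""")

QUERY = """
    SELECT a.id
    FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.id AND b.field = 'my_string')
"""

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(QUERY)

# Hypothetical index name; the columns follow the b(id, field) suggestion.
conn.execute("CREATE INDEX idx_b_id_field ON b (id, field)")
plan_after = plan(QUERY)

for line in plan_before:
    print("before:", line)
for line in plan_after:
    print("after: ", line)
```

If the index is picked up, its name shows up in the "after" lines for the lookup on b.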

In general, NOT IN (SELECT ...) is more inefficient than the LEFT JOIN, because the SELECT in the IN() condition may be executed once for every row in order to perform the filter, depending on the database's optimizer. For small datasets, that's not a problem, but for large datasets, that can be quite inefficient.
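Rather than relying on a rule of thumb, it is probably safest to time the candidates on data shaped like your own. A rough sketch of that, again with Python's sqlite3 and made-up data (5000 rows per table, every third id flagged 'my_string'); the timings it prints are specific to SQLite and this machine and say nothing definite about MySQL.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, field TEXT);
    CREATE INDEX idx_b_id_field ON b (id, field);
""")

# Hypothetical test data: every third id in b carries 'my_string'.
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(5000)])
conn.executemany(
    "INSERT INTO b VALUES (?, ?)",
    [(i, "my_string" if i % 3 == 0 else "other") for i in range(5000)],
)

QUERIES = {
    "not_in": """
        SELECT a.id FROM a
        WHERE a.id NOT IN (SELECT id FROM b WHERE b.field = 'my_string')
    """,
    "left_join": """
        SELECT a.id
        FROM a LEFT JOIN b ON a.id = b.id AND b.field = 'my_string'
        WHERE b.field IS NULL
    """,
    "not_exists": """
        SELECT a.id FROM a
        WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.id AND b.field = 'my_string')
    """,
}

def timed(sql):
    """Run a query once; return its sorted ids and the elapsed seconds."""
    start = time.perf_counter()
    rows = sorted(row[0] for row in conn.execute(sql))
    return rows, time.perf_counter() - start

results = {name: timed(sql) for name, sql in QUERIES.items()}
for name, (rows, seconds) in results.items():
    print(f"{name}: {len(rows)} rows in {seconds:.4f}s")
```

Checking that all three variants return the same rows before comparing times is worth doing, since a faster query that returns different rows is not an improvement.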
