
SQL select rows if a column value is not in a group of a different column's values

For each identifier, how can I return the quantity when the received country is not equal to any of the delivered countries? I need an efficient query for the steps below, since my table is huge.

These are the steps I think could do this; of course, you don't need to follow them :)

  1. Create a group of 'delivered' countries for each identifier.
  2. See if 'received' is any of these countries for each identifier. If there is no match, return this result.

Starting Table:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  USA                  France          432
1                  France               USA             450
1                  Ireland              Russia          100
2                  Germany              Germany         1,034
3                  USA                  France          50
3                  USA                  USA             120

Result:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  Ireland              Russia          100 

The starting table is about 30,000,000 rows, so self-joins will unfortunately be impossible. I am using something similar to MySQL.

I think a LEFT JOIN query should work for you:

SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL;

Example: SQLFiddle
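To try the LEFT JOIN query locally, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database from Python. SQLite stands in for the asker's MySQL-like engine, and the schema is an assumption: the column is named `id` to match the query (the question's table calls it `identifier`), and the quantity `1,034` is stored as the integer 1034. As written, the query keeps rows whose `delivered` value never appears as a `received` value for the same `id`.

```python
import sqlite3

# Sample rows from the question. Column names are assumed: `id` matches the
# query, while the question's table header calls it `identifier`.
ROWS = [
    (1, "USA", "France", 432),
    (1, "France", "USA", 450),
    (1, "Ireland", "Russia", 100),
    (2, "Germany", "Germany", 1034),
    (3, "USA", "France", 50),
    (3, "USA", "USA", 120),
]

# The LEFT JOIN anti-join from the answer, unchanged.
LEFT_JOIN_SQL = """
SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL
"""


def run_left_join():
    """Build the sample table in memory and return the query's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE starting "
        "(id INTEGER, delivered TEXT, received TEXT, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO starting VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(LEFT_JOIN_SQL).fetchall()
    conn.close()
    return result


print(run_left_join())  # [(1, 'Ireland', 'Russia', 100)]
```

On the sample data this reproduces the single row shown in the expected result. It says nothing about performance on 30,000,000 rows.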

To optimize the above query, adding the following composite index should give you better performance:

ALTER TABLE starting ADD KEY ix1 (id, delivered, received);

You could use a NOT EXISTS subquery:

SELECT  a.*
FROM    starting a
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    starting b
        WHERE   a.id = b.id
                AND a.delivered = b.received
        )

This is not a self-join, but the query optimizer is free to execute it as one (and usually does).
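As a rough check that the NOT EXISTS form and the LEFT JOIN ... IS NULL form select the same rows, the sketch below runs both against the question's sample data in an in-memory SQLite database. The schema (column `id`, integer `quantity`) and the use of SQLite rather than the asker's MySQL-like engine are assumptions, and agreement on six rows does not show how either form is planned on a large table.

```python
import sqlite3

# Sample rows from the question (column names assumed to match the queries).
ROWS = [
    (1, "USA", "France", 432),
    (1, "France", "USA", 450),
    (1, "Ireland", "Russia", 100),
    (2, "Germany", "Germany", 1034),
    (3, "USA", "France", 50),
    (3, "USA", "USA", 120),
]

NOT_EXISTS_SQL = """
SELECT a.*
FROM starting a
WHERE NOT EXISTS (
    SELECT *
    FROM starting b
    WHERE a.id = b.id
      AND a.delivered = b.received
)
"""

LEFT_JOIN_SQL = """
SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL
"""


def compare_queries():
    """Return (not_exists_rows, left_join_rows), each sorted for comparison."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE starting "
        "(id INTEGER, delivered TEXT, received TEXT, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO starting VALUES (?, ?, ?, ?)", ROWS)
    not_exists_rows = sorted(conn.execute(NOT_EXISTS_SQL).fetchall())
    left_join_rows = sorted(conn.execute(LEFT_JOIN_SQL).fetchall())
    conn.close()
    return not_exists_rows, left_join_rows


not_exists_rows, left_join_rows = compare_queries()
print(not_exists_rows == left_join_rows)  # True
```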


 