简体   繁体   中英

Filter values based on another column in SQL

I have a table with purchase orders, the orders' lines and a code for each line

Order_ID LINE CODE
A0001 1 aaaa
A0002 1 bbbb
A0002 2 xxxx
A0003 1 cccc
A0004 1 xxxx
A0004 2 dddd

And I need to filter out all the Orders that have at least one line with the code 'xxxx':

Order_ID LINE CODE
A0001 1 aaaa
A0003 1 cccc

I thought something like this:

SELECT *
FROM MyTable
WHERE Order_ID not in (SELECT * FROM MyTable WHERE CODE = 'xxxx')

BUT, the big problem here is that I'm working with a pretty big query so the subquery is also too large and the whole query takes a lot to run. Is there any workaround to avoid the subquery?

The column Order_ID should be compared with itself, not with any column ( * ):

SELECT *
FROM MyTable
WHERE Order_ID not in (SELECT Order_ID FROM MyTable WHERE CODE = 'xxxx')

Beyond the issue others have pointed out with your syntax, the big problem is the subquery being called for each row. I've done a test on a similar big table with a non-unique field (like your Order_ID ), and found very large savings by structuring the query with a CTE:

WITH bad_ids AS (
  SELECT DISTINCT Order_ID
  FROM MyTable
  WHERE CODE = 'xxxx'
)

SELECT *
FROM MyTable m
LEFT JOIN bad_ids b
  ON m.Order_ID = b.Order_ID
WHERE b.Order_ID IS NULL;

Here I'm assuming Order_ID is not allowed to be null.

Given the huge difference I saw in performance, I'd be surprised if you didn't see the same, no matter what the differences are between query planners.

Edit : Essentially this works because the above query is searching through the table twice and joining once. Your query is searching some subset of the table as many times as you have rows.

SELECT *
FROM MyTable
WHERE CODE <> 'xxxx'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM