
LEFT JOIN WHERE RIGHT IS NULL for same table in Teradata SQL

I have a table with 51 records. Its structure looks something like this:

ack_extract_id query_id cnst_giftran_key field1 value1

ack_extract_id can be 8 or 9. I want to find the giftran keys that are present for extract ID 9 but not for extract ID 8.

What I tried was

            SELECT *
            FROM ddcoe_tbls.ack_flextable ack_flextable1
            INNER JOIN ddcoe_tbls.ack_main_config config
                ON ack_flextable1.ack_extract_id = config.ack_extract_id
            LEFT JOIN ddcoe_tbls.ack_flextable ack_flextable2
                ON ack_flextable1.cnst_giftran_key = ack_flextable2.cnst_giftran_key
            WHERE  ack_flextable2.cnst_giftran_key IS NULL
            AND  config.ack_extract_file_nm LIKE '%Dtl%'
                AND ack_flextable2.ack_extract_id = 8
                AND ack_flextable1.ack_extract_id = 9

But it returns 0 records. Shouldn't a left join with an IS NULL test on the right-hand side return the records whose cnst_giftran_key is not present in the right-hand table?

What am I missing here?

When you test columns from the left-joined table in the WHERE clause (ack_flextable2.ack_extract_id in your case), you force that join to behave as if it were an inner join. Instead, move that test into the join condition.

Then, to find records where no matching row exists, test for a NULL key in the WHERE clause.

        SELECT *
        FROM ddcoe_tbls.ack_flextable ack_flextable1
        INNER JOIN ddcoe_tbls.ack_main_config config
            ON ack_flextable1.ack_extract_id = config.ack_extract_id
        LEFT JOIN ddcoe_tbls.ack_flextable ack_flextable2
            ON ack_flextable1.cnst_giftran_key = ack_flextable2.cnst_giftran_key
                AND ack_flextable2.ack_extract_id = 8
        WHERE ack_flextable2.cnst_giftran_key IS NULL
            AND config.ack_extract_file_nm LIKE '%Dtl%'
            AND ack_flextable1.ack_extract_id = 9
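
The difference between filtering in WHERE and filtering in ON can be checked on a tiny sample. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 module rather than Teradata, trims the table to two columns, leaves out the config join, and the rows and the aliases f1 and f2 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ack_flextable (ack_extract_id INTEGER, cnst_giftran_key INTEGER);
    -- Hypothetical data: keys 100 and 200 in extract 8, keys 100 and 300 in extract 9.
    INSERT INTO ack_flextable VALUES (8, 100), (8, 200), (9, 100), (9, 300);
""")

# Filter on the right-hand table placed in WHERE (as in the question).
filter_in_where = conn.execute("""
    SELECT f1.cnst_giftran_key
    FROM ack_flextable f1
    LEFT JOIN ack_flextable f2
        ON f1.cnst_giftran_key = f2.cnst_giftran_key
    WHERE f2.cnst_giftran_key IS NULL
      AND f2.ack_extract_id = 8
      AND f1.ack_extract_id = 9
""").fetchall()

# Same filter moved into the join condition.
filter_in_on = conn.execute("""
    SELECT f1.cnst_giftran_key
    FROM ack_flextable f1
    LEFT JOIN ack_flextable f2
        ON f1.cnst_giftran_key = f2.cnst_giftran_key
       AND f2.ack_extract_id = 8
    WHERE f2.cnst_giftran_key IS NULL
      AND f1.ack_extract_id = 9
""").fetchall()

print(filter_in_where)  # []
print(filter_in_on)     # [(300,)]
```

Note that in the first query every row also matches itself on cnst_giftran_key, because both sides are the same table, so the right-hand key is never NULL even before the extract ID test is applied.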

THIS IS NO ANSWER, JUST AN EXPLANATION

From your comment to Joe Stefanelli's answer I gather that you don't fully understand the issue with WHERE and ON in an outer join. So let's look at an example.

We are looking for each supplier's last order, i.e. the order records for which there is no newer order for the same supplier.

select *
from order
where not exists
(
  select *
  from order newer 
  where newer.supplier = order.supplier 
    and newer.orderdate > order.orderdate
);

This is straightforward; the query matches what we just put in words: find orders for which there NOT EXISTS a newer order for the same supplier.

The same query with the anti-join pattern:

select order.*
from order
left join order newer on  newer.supplier = order.supplier 
                      and newer.orderdate > order.orderdate
where newer.id is null;

Here we join every order with all its newer orders, thus probably creating a huge intermediate result. With the left outer join we make sure we get a dummy record attached when there is no newer order for the supplier. Finally we scan the intermediate result with the WHERE clause, keeping only the records where the attached record has a NULL ID. The ID is the table's primary key and can never be NULL in a real row, so what we keep is only the outer-joined results where the newer data is just a dummy record containing NULLs. Thus we get exactly the orders for which no newer order exists.
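
To see that the two formulations agree, here is a small sketch run against SQLite through Python's sqlite3 module. The table is called orders here (an assumption, since order is a reserved word), the alias o is mine, and the three rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, supplier TEXT, orderdate TEXT);
    -- Hypothetical data: supplier A has two orders, supplier B has one.
    INSERT INTO orders VALUES
        (1, 'A', '2024-01-01'),
        (2, 'A', '2024-02-01'),
        (3, 'B', '2024-01-15');
""")

# NOT EXISTS formulation.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT o.id
    FROM orders o
    WHERE NOT EXISTS (
        SELECT 1
        FROM orders newer
        WHERE newer.supplier = o.supplier
          AND newer.orderdate > o.orderdate
    )
    ORDER BY o.id
""")]

# Anti-join formulation: all criteria stay in ON, only the NULL test is in WHERE.
anti_join_ids = [row[0] for row in conn.execute("""
    SELECT o.id
    FROM orders o
    LEFT JOIN orders newer
        ON newer.supplier = o.supplier
       AND newer.orderdate > o.orderdate
    WHERE newer.id IS NULL
    ORDER BY o.id
""")]

print(not_exists_ids)  # [2, 3]
print(anti_join_ids)   # [2, 3]
```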

Talking about a huge intermediate result: How can this be faster than the first query? Well, it shouldn't. The first query should actually either run equally fast or faster. A good DBMS will see through this and make the same execution plan for both queries. A rather young DBMS however may really execute the anti join quicker. That is because the developers put so much effort into join techniques, as these are needed in about every query, and didn't yet care about IN and EXISTS that much. In such a case one may run into performance issues with NOT IN or NOT EXISTS and use the anti-join pattern instead.

Now as to the WHERE / ON problem:

select order.*
from order
left join order newer on newer.orderdate > order.orderdate
where newer.supplier = order.supplier
and newer.id is null;

This looks almost the same as before, but one criterion has moved from ON to WHERE. This means the outer join gets different criteria. Here is what happens: for every order, find all newer orders, no matter which supplier! So only the orders with the latest order date get an outer-join dummy record. Then, in the WHERE clause, we remove all pairs where the supplier doesn't match. Notice that the outer-joined dummy records contain NULL for newer.supplier, so newer.supplier = order.supplier is never true for them; they get removed as well. But if we remove all outer-joined records, we get exactly the same result as with a plain inner join. When we put outer join criteria in the WHERE clause, we turn the outer join into an inner join. So the query can be rewritten as

select order.*
from order
inner join order newer on newer.orderdate > order.orderdate
where newer.supplier = order.supplier
and newer.id is null;

And with tables in FROM and INNER JOIN it doesn't matter whether a criterion is in ON or WHERE; it's rather a matter of readability, because both sets of criteria get applied equally.

Now we see that newer.id is null can never be true. The final result will be empty, which is exactly what happened with your query.
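
The collapse into an inner join can be observed directly. This is a sketch on made-up data, using SQLite through Python's sqlite3 module, with the table named orders (an assumption, because order is a reserved word) and a hypothetical helper pairs() to run the same WHERE clause under both join types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, supplier TEXT, orderdate TEXT);
    -- Hypothetical data: supplier A has two orders, supplier B has one.
    INSERT INTO orders VALUES
        (1, 'A', '2024-01-01'),
        (2, 'A', '2024-02-01'),
        (3, 'B', '2024-01-15');
""")

def pairs(join_type, extra_where=""):
    """Run the query with the supplier criterion in WHERE for the given join type."""
    sql = f"""
        SELECT o.id, newer.id
        FROM orders o
        {join_type} JOIN orders newer ON newer.orderdate > o.orderdate
        WHERE newer.supplier = o.supplier {extra_where}
        ORDER BY o.id, newer.id
    """
    return conn.execute(sql).fetchall()

# With the supplier test in WHERE, LEFT JOIN and INNER JOIN return the same pairs.
left_pairs = pairs("LEFT")
inner_pairs = pairs("INNER")

# Adding the NULL test on top of that can never match anything.
left_with_null_test = pairs("LEFT", "AND newer.id IS NULL")

print(left_pairs)           # [(1, 2)]
print(inner_pairs)          # [(1, 2)]
print(left_with_null_test)  # []
```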

You can try this query (it reads from ack_flextable, since that is the table that holds cnst_giftran_key):

select * from ddcoe_tbls.ack_flextable
where cnst_giftran_key not in 
  (
   select cnst_giftran_key from ddcoe_tbls.ack_flextable 
   where ack_extract_id = 8
  )  
and ack_extract_id = 9;
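
One caveat with NOT IN: if the subquery returns a NULL, the comparison becomes unknown and no row passes. The question does not say whether cnst_giftran_key is nullable, so this may not apply here. The sketch below uses SQLite through Python's sqlite3 module, a two-column stand-in for the table, made-up rows (including one NULL key in extract 8), and aliases f1 and f2 that are mine, to contrast NOT IN with NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ack_flextable (ack_extract_id INTEGER, cnst_giftran_key INTEGER);
    -- Hypothetical data: extract 8 contains a row with a NULL key.
    INSERT INTO ack_flextable VALUES (8, 100), (8, NULL), (9, 100), (9, 300);
""")

# NOT IN: the NULL in the subquery result makes "300 NOT IN (...)" unknown.
not_in_keys = [row[0] for row in conn.execute("""
    SELECT cnst_giftran_key
    FROM ack_flextable
    WHERE cnst_giftran_key NOT IN (
        SELECT cnst_giftran_key FROM ack_flextable WHERE ack_extract_id = 8
    )
      AND ack_extract_id = 9
""")]

# NOT EXISTS: the NULL row simply never matches, so key 300 is still found.
not_exists_keys = [row[0] for row in conn.execute("""
    SELECT f1.cnst_giftran_key
    FROM ack_flextable f1
    WHERE f1.ack_extract_id = 9
      AND NOT EXISTS (
        SELECT 1
        FROM ack_flextable f2
        WHERE f2.ack_extract_id = 8
          AND f2.cnst_giftran_key = f1.cnst_giftran_key
    )
""")]

print(not_in_keys)      # []
print(not_exists_keys)  # [300]
```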

