简体   繁体   中英

MySQL NOT IN with subquery not working as expected

I'm creating a application that will generate lists for email marketing campaigns. I have tables for contacts, emails, and campaigns. A campaign has many emails and a contact has many emails. The email is related to a contact and a campaign. Basically a table for a MANY to MANY relationship except I have other fields in the table for the result of the email (clicked, opened, unsubscribed, etc). There are also other tables but this is where I'm having the trouble.

I'm trying to use NOT IN with a subquery to get a list of contacts who have not received an email since a certain date with other conditions. An example query is this:

SELECT * 
FROM `contact` `t` 
WHERE (unsubscribed='1')
  AND t.id NOT IN 
   (SELECT distinct contact_id 
    FROM email, campaign 
    WHERE email.campaign_id = campaign.id 
      AND campaign.date_sent >= '2012-07-12') 
ORDER BY rand() 
LIMIT 10000

This returns 0 result. However, if I run the first condition:

select id 
from contact 
where unsubscribed=1

I have 9075 rows. Then, if I separately run the subquery:

SELECT distinct contact_id 
FROM email, campaign 
WHERE email.campaign_id = campaign.id 
  AND campaign.date_sent >= '2012-07-12'

I have 116612 rows. Out of each of those results, I end up with 826 values that are duplicates. From what I can understand, this means that 9075-826=8249 records ARE unsubscribed=1 AND NOT IN the second query. So, my first query should be returning 8249 results but it is returning 0. I must be structuring the query wrong or using the wrong operators but I can not for the life of me figure out how to get this right.

Can anyone help? So many thanks in advance as this has had me stumped for like 3 days! :)

This is because

SELECT 1 FROM DUAL WHERE 1 NOT IN (NULL, 2) 

won't return anything, whereas

SELECT 1 FROM DUAL WHERE 1 NOT IN (2)

will.

Please review the behaviour of NOT IN and NULL in MYSQL.

For your concern you should get away with it using NOT EXISTS instead of NOT IN :

SELECT * FROM `contact` `t` 
WHERE (unsubscribed='1')
AND NOT EXISTS (
    SELECT * FROM email, campaign 
    WHERE 
        email.campaign_id = campaign.id 
    AND campaign.date_sent >= '2012-07-12'
    AND t.id = contact_id
) 
ORDER BY rand() 
LIMIT 10000

Just wasted a few hours and a few hairs on this as well.

Was not able to get "NOT exists" to work as the accepted answer mentioned. However, you can simply throw in a NOT NULL in

WHERE "field you are aggregating is not NULL ", and that did the job.

SELECT * 
FROM `contact` `t` 
WHERE (unsubscribed='1')
  AND t.id NOT IN 
   (SELECT distinct contact_id 
    FROM email, campaign 
    WHERE email.campaign_id = campaign.id 
      AND campaign.date_sent >= '2012-07-12'
      AND contact_id is not NULL          ###*************added line
) 
ORDER BY rand() 
LIMIT 10000
select c.*, e.id from contact as c 
left join email as e on c.id = e.contact_id and e.date_sent >= '2012-07-12' 
where e.id is null and c.unsubscribed = 1

I think campaign.date_sent was a typo? It must be email.date_sent?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM