简体   繁体   中英

Intersection of two (very) big tables

I have two tables: all_users and vip_users

all_users table has a list of all users (you don't say?) in my system and it currently has around 57k records, while vip_users table has around 37k records.

Primary key in both tables is an autoincrement id field. all_users table is big in terms of attribute count (around 20, one of them is email ), while vip_users table has only (along with id ) email attribute.

I wanted to query out the "nonVip" users by doing this (with help of this question here on SO):

SELECT all_users.id, all_users.email
FROM all_users 
LEFT OUTER JOIN vip_users 
    ON (all_users.email=vip_users.email) 
WHERE vip_users.email IS NULL

And now, finally coming to the problem - I ran this query in phpmyadmin and even after 20 minutes I was forced to close it and restart httpd service as it was taking too long to complete, my server load jumped over 2 and the site (which also queries the database) became useless as it was just loading too slow. So, my question is - how do I make this query? Do I make some script and run it over night - not using phpmyadmin (is this maybe where the problem lies?), or do I need to use different SQL query?

Please help with your thoughts on this.

Try indexing the fields email on both tables, that should speed up the query

CREATE INDEX useremail ON all_users(email)

CREATE INDEX vipemail ON vip_users(email)

I think NOT IN is faster and used less resource than LEFT OUTER JOIN .

Can you try -

SELECT *
FROM all_users
WHERE id NOT IN (SELECT id 
                 FROM vip_users
                 WHERE email IS NULL);

As written, you're not getting the results you're looking for. You're looking for vip_users rows where the email matches an email in users, and is also NULL.

Is there a reason you want vip_users to have a separate id from users? If you change the vip_users id field to a fk on the users id field, yo would then change your select to:

SELECT all_users.id, all_users.email
FROM all_users 
LEFT OUTER JOIN vip_users 
    ON (all_users.id=vip_users.id) 
WHERE vip_users.email IS NULL;

There's no reason this query should take any discernible about of time. 37k records is not a very big table....

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM