A better / more efficient way to write this query

I am trying to count the number of records in a table. The table is called affiliations and has only 4 columns (2 of which are foreign keys).

I want to count the number of records where the affiliated column is 0 and the business_id is related to a particular account_email.

I know how to do this query using the IN keyword, but I was wondering if there is a better or more efficient way to do this.

This is the IN version of the query:

SELECT COUNT(1) FROM affiliations 
WHERE business_id IN (
    SELECT business_id 
    FROM affiliations 
    WHERE account_email = 'address@domain.ext'
) AND affiliated = 0

I understand I could probably replace this with EXISTS:

SELECT COUNT(1) FROM affiliations 
WHERE EXISTS (
    SELECT 1 FROM affiliations 
    WHERE account_email = 'address@domain.ext'
) AND affiliated = 0

Would the statement with EXISTS work? And as previously asked, is there just a better way to do this?

Thanks in advance!

I'd use EXISTS, but remember to correlate the subquery with the outer table, as below:

SELECT COUNT(1) FROM affiliations a
WHERE EXISTS (
    SELECT 1
    FROM affiliations a1
    WHERE a1.account_email = 'address@domain.ext'
      AND a1.business_id = a.business_id
) AND a.affiliated = 0
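
To see why the correlation matters, here is a minimal sketch that runs the IN query, the uncorrelated EXISTS from the question, and the correlated EXISTS against a tiny in-memory table. The schema, the sample rows, and the use of SQLite through Python's sqlite3 module (rather than MySQL) are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE affiliations ("
    " id INTEGER PRIMARY KEY,"
    " business_id INTEGER NOT NULL,"
    " account_email TEXT NOT NULL,"
    " affiliated INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO affiliations (business_id, account_email, affiliated)"
    " VALUES (?, ?, ?)",
    [
        (1, "address@domain.ext", 1),
        (1, "other@domain.ext", 0),
        (1, "third@domain.ext", 0),
        (2, "other@domain.ext", 0),
        (3, "fourth@domain.ext", 0),
    ],
)

EMAIL = "address@domain.ext"

IN_QUERY = """
SELECT COUNT(1) FROM affiliations
WHERE business_id IN (
    SELECT business_id FROM affiliations WHERE account_email = ?
) AND affiliated = 0
"""

# No link between the subquery and the outer row: the EXISTS test is
# true for every outer row as soon as any row has the given email.
UNCORRELATED_EXISTS = """
SELECT COUNT(1) FROM affiliations
WHERE EXISTS (
    SELECT 1 FROM affiliations WHERE account_email = ?
) AND affiliated = 0
"""

# The extra business_id condition ties the subquery to the outer row.
CORRELATED_EXISTS = """
SELECT COUNT(1) FROM affiliations a
WHERE EXISTS (
    SELECT 1 FROM affiliations a1
    WHERE a1.account_email = ?
      AND a1.business_id = a.business_id
) AND a.affiliated = 0
"""


def count(sql):
    """Run one of the COUNT queries for EMAIL and return the number."""
    return conn.execute(sql, (EMAIL,)).fetchone()[0]


in_count = count(IN_QUERY)
uncorrelated_count = count(UNCORRELATED_EXISTS)
correlated_count = count(CORRELATED_EXISTS)
print(in_count, uncorrelated_count, correlated_count)  # prints "2 4 2"
```

With this sample data the uncorrelated version counts every row where affiliated = 0, while the IN query and the correlated EXISTS count only the rows of business 1.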

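The first query from the question (with the IN clause) is not equivalent to the second (with EXISTS). The EXISTS subquery there does not reference the outer row, so it is true for every row as soon as any row has that account_email.

To convert the first query with IN, you must use a dependent (correlated) subquery: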

SELECT COUNT(1) FROM affiliations a1
WHERE EXISTS (
    SELECT 1 FROM affiliations a2
    WHERE account_email = 'address@domain.ext'
      AND a1.business_id = a2.business_id
) AND affiliated = 0

Pay attention to this condition: AND a1.business_id = a2.business_id

The above query is semantically equivalent to your first query with IN.
Their performance is the same as well, because MySQL, during the optimization phase, internally converts a condition of this form:

outer_expr IN (SELECT inner_expr FROM ... WHERE subquery_where)

into this:

EXISTS (SELECT 1 FROM ... WHERE subquery_where AND outer_expr=inner_expr)

See this link for details: http://dev.mysql.com/doc/refman/5.0/en/subquery-optimization-with-exists.html
Pay special attention to the discussion about NULL values and how NULL affects the optimizer.

In short: if the business_id column is declared as NOT NULL, then MySQL is able to optimize these two queries.
See the final conclusion (at the bottom of the page at this link):

To help the query optimizer better execute your queries, use these tips:

  • A column must be declared as NOT NULL if it really is. (This also helps other aspects of the optimizer.)

  • If you don't need to distinguish a NULL from FALSE subquery result, you can easily avoid the slow execution path. Replace a comparison that looks like this:

    outer_expr IN (SELECT inner_expr FROM ...)

    with this expression:

    (outer_expr IS NOT NULL) AND (outer_expr IN (SELECT inner_expr FROM ...))

    Then NULL IN (SELECT ...) will never be evaluated because MySQL stops evaluating AND parts as soon as the expression result is clear.
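
The NULL behaviour described in the quoted tips can be checked with a small sketch. It only illustrates the result values (NULL versus FALSE), not MySQL's execution path or speed. The tables t and o are hypothetical, and it runs on SQLite through Python's sqlite3 module, which is an assumption and not what the MySQL manual uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: t feeds the subquery, o holds the outer values.
conn.execute("CREATE TABLE t (inner_expr INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])
conn.execute("CREATE TABLE o (outer_expr INTEGER)")
conn.executemany("INSERT INTO o VALUES (?)", [(1,), (None,), (3,)])

# NULL IN (non-empty subquery) is NULL (unknown), not FALSE.
plain = conn.execute(
    "SELECT NULL IN (SELECT inner_expr FROM t)"
).fetchone()[0]

# With the guard, the left side of AND is already FALSE, so the
# whole expression is FALSE whatever the IN part yields.
guarded = conn.execute(
    "SELECT (NULL IS NOT NULL) AND (NULL IN (SELECT inner_expr FROM t))"
).fetchone()[0]

# In a WHERE clause both NULL and FALSE reject the row, so the two
# forms select the same rows.
plain_rows = conn.execute(
    "SELECT COUNT(*) FROM o"
    " WHERE outer_expr IN (SELECT inner_expr FROM t)"
).fetchone()[0]
guarded_rows = conn.execute(
    "SELECT COUNT(*) FROM o"
    " WHERE outer_expr IS NOT NULL"
    " AND outer_expr IN (SELECT inner_expr FROM t)"
).fetchone()[0]

print(plain, guarded, plain_rows, guarded_rows)  # prints "None 0 1 1"
```

The guarded form changes the result only where the distinction between NULL and FALSE is visible; as a row filter it behaves the same as the plain IN.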

Use JOIN instead of IN. IN can perform poorly if you're trying to match a lot of values:

SELECT COUNT(1) 
FROM affiliations AS ABB2
    JOIN (SELECT business_id 
          FROM affiliations 
          WHERE account_email = 'address@domain.ext') AS ABB1
        ON ABB1.business_id = ABB2.business_id
WHERE affiliated = 0
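
One thing to check before swapping IN for a JOIN: the join returns one output row per match, so if the derived table can return the same business_id more than once, the count goes up. The sketch below shows this on hypothetical data where the same (business_id, account_email) pair is stored twice; whether that can happen depends on the real schema, which the question does not show. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE affiliations ("
    " id INTEGER PRIMARY KEY,"
    " business_id INTEGER NOT NULL,"
    " account_email TEXT NOT NULL,"
    " affiliated INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO affiliations (business_id, account_email, affiliated)"
    " VALUES (?, ?, ?)",
    [
        (1, "address@domain.ext", 1),
        (1, "address@domain.ext", 1),  # same pair stored a second time
        (1, "other@domain.ext", 0),
        (2, "other@domain.ext", 0),
    ],
)

EMAIL = "address@domain.ext"

IN_QUERY = """
SELECT COUNT(1) FROM affiliations
WHERE business_id IN (
    SELECT business_id FROM affiliations WHERE account_email = ?
) AND affiliated = 0
"""

JOIN_QUERY = """
SELECT COUNT(1)
FROM affiliations AS ABB2
    JOIN (SELECT business_id
          FROM affiliations
          WHERE account_email = ?) AS ABB1
        ON ABB1.business_id = ABB2.business_id
WHERE ABB2.affiliated = 0
"""

# DISTINCT removes repeated business_id values before joining.
JOIN_DISTINCT_QUERY = """
SELECT COUNT(1)
FROM affiliations AS ABB2
    JOIN (SELECT DISTINCT business_id
          FROM affiliations
          WHERE account_email = ?) AS ABB1
        ON ABB1.business_id = ABB2.business_id
WHERE ABB2.affiliated = 0
"""


def count(sql):
    """Run one of the COUNT queries for EMAIL and return the number."""
    return conn.execute(sql, (EMAIL,)).fetchone()[0]


in_count = count(IN_QUERY)
join_count = count(JOIN_QUERY)
join_distinct_count = count(JOIN_DISTINCT_QUERY)
print(in_count, join_count, join_distinct_count)  # prints "1 2 1"
```

If each email can be linked to a business only once (for example through a unique key on the pair), the plain JOIN and the IN query return the same count.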
