
SQL WHERE statements, where column may be value or may be NULL

I have an SQL table (in SQLite3) in which I am trying to aggregate information from several other tables, and records in one table may or may not have a corresponding record in another table. My query is supposed to include in the aggregate table both records with and without linked information. For example:

            CREATE TABLE all_households AS
                SELECT pop.uid AS pop_uid,
                       pop.surname,
                       pop.given,
                       pop.age,
                       pop.real_property,

                       farm.uid AS farm_uid,
                       farm.improved_acres,
                       farm.unimproved_acres,
                       farm.cash_value,
                       farm.corn,
                       farm.cotton

                       FROM pop, farm
                       WHERE pop.farm_id = farm.uid;

This is looking at data from census schedules. Everybody in the census will have the basic pop information -- surname, given name, value of real property -- but not everybody has a farm. Only certain individuals have a value in the farm_id column on pop, corresponding to the record of that person's farm on farm; otherwise farm_id is NULL.

But naturally, the above query will fetch only those individuals for whom pop.farm_id = farm.uid -- that is, who have farms, and have values for farm_id. The farmless individuals are excluded, and I want to include them, with empty values for the relevant farm columns in all_households.

Now, I know I could solve this, and have so far, with separate SELECT statements for each linked column, like so:

            CREATE TABLE all_households AS
                SELECT uid AS pop_uid,
                       surname,
                       given,
                       age,
                       real_property,

                       (SELECT uid FROM farm WHERE pop.farm_id = farm.uid) AS farm_uid,
                       (SELECT improved_acres FROM farm WHERE pop.farm_id = farm.uid) AS improved_acres,
                       (SELECT unimproved_acres FROM farm WHERE pop.farm_id = farm.uid) AS unimproved_acres,
                       (SELECT cash_value FROM farm WHERE pop.farm_id = farm.uid) AS cash_value,
                       (SELECT corn FROM farm WHERE pop.farm_id = farm.uid) AS corn,
                       (SELECT cotton FROM farm WHERE pop.farm_id = farm.uid) AS cotton

                       FROM pop;

But this seems terribly clunky and inelegant. So, I wondered if there was a way to make the first query above pick up entries from pop where farm_id was NULL:

            WHERE pop.farm_id = farm.uid OR pop.farm_id IS NULL;

But then things went very haywire, and I'm not sure why. In my real, unsimplified query, I'm actually dealing with four tables, each with a column on pop that may be a value or may be NULL, and though the first query above as written took only seconds, the query with this WHERE hung. Forever. And when I came back, it had died with the error that "database or disk is full." So whatever I did, I seem to have elicited some kind of endless loop. I tried alternately:

            WHERE (CASE WHEN pop.farm_id IS NOT NULL THEN pop.farm_id = farm.uid ELSE 1 END);

But this had the same result as before. Can anybody shed any light on what I'm doing wrong, or what I might do better? Thanks.

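Your attempt to use farm_id IS NULL was slow because, for a pop record whose farm_id is NULL, the condition is true no matter which farm record it is paired with, so the database attempted to give you the combination of each such pop record with every farm record. With several tables joined this way the row counts multiply, which is why the query seemed to hang and finally failed with "database or disk is full"; it was a huge result, not an endless loop. Furthermore, constraints with OR are not easy to optimize, and the query was executed with a temporary table.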
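To make the row explosion concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tiny tables and their contents are made up for illustration (only pop, farm, uid, surname, corn and farm_id come from the question); the point is just that every pop row with a NULL farm_id gets paired with every farm row.

```python
import sqlite3

# Illustrative data only: three farms, one person with a farm, two without.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE farm (uid INTEGER PRIMARY KEY, corn INTEGER);
    CREATE TABLE pop  (uid INTEGER PRIMARY KEY, surname TEXT, farm_id INTEGER);
    INSERT INTO farm VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO pop  VALUES (1, 'Adams', 1), (2, 'Baker', NULL), (3, 'Clark', NULL);
""")

# The WHERE clause from the question.
or_rows = conn.execute("""
    SELECT pop.surname, farm.uid
    FROM pop, farm
    WHERE pop.farm_id = farm.uid OR pop.farm_id IS NULL
""").fetchall()

farm_count = conn.execute("SELECT COUNT(*) FROM farm").fetchone()[0]
null_count = conn.execute(
    "SELECT COUNT(*) FROM pop WHERE farm_id IS NULL").fetchone()[0]
matched = conn.execute(
    "SELECT COUNT(*) FROM pop JOIN farm ON pop.farm_id = farm.uid").fetchone()[0]

# Each farmless person appears once per farm row, not once overall.
print(len(or_rows), "rows =", matched, "+", null_count, "*", farm_count)
```

With real census tables, null_count and farm_count are both large, and each additional table joined the same way multiplies the result again.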

To get all matched/joined records, and all records from the first table with no corresponding farm, combine two queries with UNION ALL:

SELECT pop. ..., farm. ...
FROM pop JOIN farm ON pop.farm_id = farm.uid

UNION ALL

SELECT pop. ..., NULL, NULL, ...
FROM pop
WHERE pop.farm_id IS NULL
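
As a runnable sketch of the UNION ALL pattern above, again with made-up sample data and a reduced set of columns (the elided column lists are not filled in here, only a few names from the question are used):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE farm (uid INTEGER PRIMARY KEY, corn INTEGER);
    CREATE TABLE pop  (uid INTEGER PRIMARY KEY, surname TEXT, farm_id INTEGER);
    INSERT INTO farm VALUES (1, 100), (2, 200);
    INSERT INTO pop  VALUES (1, 'Adams', 1), (2, 'Baker', NULL), (3, 'Clark', NULL);
""")

rows = conn.execute("""
    -- people who have a farm
    SELECT pop.uid, pop.surname, farm.uid, farm.corn
    FROM pop JOIN farm ON pop.farm_id = farm.uid

    UNION ALL

    -- people without a farm, padded with NULLs for the farm columns
    SELECT pop.uid, pop.surname, NULL, NULL
    FROM pop
    WHERE pop.farm_id IS NULL

    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

Both halves must return the same number of columns, so each farm column in the first SELECT needs a matching NULL in the second.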

This construct is called an outer join and is supported directly in most SQL databases. (SQLite versions before 3.39.0 support only left outer joins, which is what you want here.)

SELECT pop. ..., farm. ...
FROM pop LEFT OUTER JOIN farm ON pop.farm_id = farm.uid
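
Applied to the CREATE TABLE from the question, the left join might look like the sketch below. The sample rows are invented and only some of the question's columns are included to keep it short; treat it as an illustration, not as the full query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE farm (uid INTEGER PRIMARY KEY,
                       improved_acres INTEGER,
                       cash_value INTEGER);
    CREATE TABLE pop  (uid INTEGER PRIMARY KEY,
                       surname TEXT,
                       given TEXT,
                       farm_id INTEGER);
    INSERT INTO farm VALUES (10, 40, 500);
    INSERT INTO pop  VALUES (1, 'Adams', 'John', 10),
                            (2, 'Baker', 'Mary', NULL);

    -- One row per pop record; farm columns are NULL where there is no farm.
    CREATE TABLE all_households AS
        SELECT pop.uid AS pop_uid,
               pop.surname,
               pop.given,
               farm.uid AS farm_uid,
               farm.improved_acres,
               farm.cash_value
        FROM pop LEFT OUTER JOIN farm ON pop.farm_id = farm.uid;
""")

rows = conn.execute(
    "SELECT * FROM all_households ORDER BY pop_uid").fetchall()
for row in rows:
    print(row)
```

Further optional tables can be added the same way, with one more LEFT OUTER JOIN ... ON clause per table.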

Please note that a left outer join returns every pop record that has no match in farm, so unlike the UNION ALL query, it will also return pop records with an invalid farm_id (one that is not NULL but does not match any farm.uid).
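
The difference only shows up when pop contains a farm_id that points nowhere. A small sketch with invented data (person 3 has farm_id 99, which does not exist in farm) compares the two approaches and shows one possible way to list such dangling references:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE farm (uid INTEGER PRIMARY KEY, corn INTEGER);
    CREATE TABLE pop  (uid INTEGER PRIMARY KEY, surname TEXT, farm_id INTEGER);
    INSERT INTO farm VALUES (1, 100);
    INSERT INTO pop  VALUES (1, 'Adams', 1),
                            (2, 'Baker', NULL),
                            (3, 'Clark', 99);   -- 99 matches no farm.uid
""")

left_join = conn.execute("""
    SELECT pop.surname, farm.uid
    FROM pop LEFT OUTER JOIN farm ON pop.farm_id = farm.uid
    ORDER BY pop.uid
""").fetchall()

union_all = conn.execute("""
    SELECT pop.surname, farm.uid
    FROM pop JOIN farm ON pop.farm_id = farm.uid
    UNION ALL
    SELECT pop.surname, NULL
    FROM pop
    WHERE pop.farm_id IS NULL
    ORDER BY 1
""").fetchall()

# pop rows whose farm_id is set but has no matching farm record
dangling = conn.execute("""
    SELECT pop.surname, pop.farm_id
    FROM pop LEFT OUTER JOIN farm ON pop.farm_id = farm.uid
    WHERE pop.farm_id IS NOT NULL AND farm.uid IS NULL
""").fetchall()

print(left_join)   # includes Clark, with NULL farm columns
print(union_all)   # Clark is missing
print(dangling)
```

If the data has no such dangling values, both queries return the same rows.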
