SQL to count records that have no match in one-to-many relationship
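I have two MySQL tables:
surveys (date, location, kilometers), primary key: date + location (one record per survey)
specimens (date, location, species) (zero or more records per survey date and location)
I want to find the count of surveys, and the total kilometers surveyed, for which the specimens table contains no record of a particular species. In other words, the number of surveys on which a particular species was not found.
The total number of surveys is: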
select count(date) as surveys, sum(kilometers) as KM_surveyed
from surveys;
+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 20141 | 40673.59 |
+---------+-------------+
Finding the number of surveys on which no specimens were found is easy:
select count(s.date) as surveys, sum(s.kilometers) as KM_surveyed
from surveys s left join specimens p
on (s.date=p.date and s.location=p.location)
where p.date is null;
+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 8820 | 15848.26 |
+---------+-------------+
The total number of records in specimens is:
select count(*) from specimens;
+-----------+
| count(*) |
+-----------+
| 51566 |
+-----------+
The correct number of seals (HASE) found across all surveys is:
select count(*) from specimens where species = 'HASE';
+-----------+
| count(*) |
+-----------+
| 662 |
+-----------+
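Finding the number of surveys on which seals (HASE) were found is not so easy.
Because the specimens table usually contains multiple records per survey, this query returns the number of HASE specimens found rather than the number of surveys: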
select count(s.date), sum(s.kilometers)
from surveys s
left join specimens p on (s.date=p.date and s.location=p.location)
where p.species = 'HASE';
+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 662 | 2030.70 | WRONG! that is number of specimens not surveys
+---------+-------------+
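Finding the number of surveys on which no seals (HASE) were found is not easy either. Instead of the number of surveys, this query returns the number of specimens found that are not seals: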
select count(s.date), sum(s.kilometers)
from surveys s
left join specimens p on (s.date=p.date and s.location=p.location)
where p.species <> 'HASE' or p.date is null;
+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 50904 | 151310.49 |
+---------+-------------+
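WRONG! 50904 = the number of non-HASE specimens
How do I construct a query that correctly counts the surveys on which seals were found and the surveys on which no seals were found?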
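When you do a LEFT JOIN to find non-matching rows, put the condition that should not match in the ON clause, not in the WHERE clause. A condition on p in the WHERE clause is applied after the join, so it filters joined rows instead of deciding which specimens count as a match.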
SELECT COUNT(*), SUM(s.kilometers)
FROM surveys AS s
LEFT JOIN specimens AS p ON s.date = p.date and s.location = p.location
AND p.species = 'HASE'
WHERE p.date IS NULL
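To see the difference between the two placements, here is a minimal sketch that runs both forms against an in-memory SQLite database from Python. The four surveys and six specimens are made-up toy data (not the asker's), and SQLite stands in for MySQL on the assumption that the join semantics in question are the same.

```python
import sqlite3

# Hypothetical toy data: 4 surveys of 2 km each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                      PRIMARY KEY (date, location));
CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
INSERT INTO surveys VALUES
  ('2020-01-01', 'A', 2.0),
  ('2020-01-02', 'A', 2.0),
  ('2020-01-03', 'B', 2.0),
  ('2020-01-04', 'B', 2.0);
INSERT INTO specimens VALUES
  ('2020-01-01', 'A', 'HASE'),
  ('2020-01-01', 'A', 'HASE'),
  ('2020-01-01', 'A', 'OTHER'),
  ('2020-01-02', 'A', 'OTHER'),
  ('2020-01-02', 'A', 'OTHER'),
  ('2020-01-04', 'B', 'HASE');
""")

# Species test in WHERE: counts joined rows that are not HASE,
# plus the one survey that has no specimens at all.
where_clause = conn.execute("""
SELECT COUNT(s.date), SUM(s.kilometers)
FROM surveys AS s
LEFT JOIN specimens AS p ON s.date = p.date AND s.location = p.location
WHERE p.species <> 'HASE' OR p.date IS NULL
""").fetchone()

# Species test in ON: a survey keeps a NULL p row only when it has
# no HASE specimen, so each such survey is counted exactly once.
on_clause = conn.execute("""
SELECT COUNT(*), SUM(s.kilometers)
FROM surveys AS s
LEFT JOIN specimens AS p ON s.date = p.date AND s.location = p.location
  AND p.species = 'HASE'
WHERE p.date IS NULL
""").fetchone()

print("filter in WHERE:", where_clause)  # (4, 8.0)
print("filter in ON:   ", on_clause)     # (2, 4.0)
```

Only the surveys of 2020-01-02 and 2020-01-03 have no HASE record, so 2 surveys and 4.0 km is the intended answer. The WHERE version reports 4 because it counts one OTHER row from the first survey and both OTHER rows from the second.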
You can use EXISTS / NOT EXISTS subqueries in the WHERE clause.
Surveys with HASE found in the specimens table:
select count(*), sum(s.kilometers)
from surveys s
where exists (
select *
from specimens p
where s.date=p.date
and s.location=p.location
and p.species = 'HASE'
)
Surveys with no HASE found in the specimens table:
select count(*), sum(s.kilometers)
from surveys s
where not exists (
select *
from specimens p
where s.date=p.date
and s.location=p.location
and p.species = 'HASE'
)
An alternative to the first query might be:
select count(*), sum(s.kilometers)
from (
select distinct date, location
from specimens
where species = 'HASE'
) p
join surveys s using (date, location)
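Depending on the data (if 'HASE' is a rare species), it might be faster, since the subquery then reduces specimens to a small set of (date, location) pairs before joining.
Barmar has already posted the best option for the second query.
Why do people find joins so hard?
Finding the number of surveys on which seals (HASE) were found: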
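As a quick sanity check on the EXISTS / NOT EXISTS pair above, the two results should add up to the totals over all surveys, because every survey satisfies exactly one of the two predicates. The sketch below checks that with Python's sqlite3 on made-up toy data (three surveys, four specimens); the `QUERY` template and its `neg` placeholder are illustrative helpers, not part of the original answer.

```python
import sqlite3

# Hypothetical toy data: only the first survey has HASE specimens.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                      PRIMARY KEY (date, location));
CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
INSERT INTO surveys VALUES
  ('2020-01-01', 'A', 1.5),
  ('2020-01-02', 'A', 2.5),
  ('2020-01-03', 'B', 4.0);
INSERT INTO specimens VALUES
  ('2020-01-01', 'A', 'HASE'),
  ('2020-01-01', 'A', 'HASE'),
  ('2020-01-01', 'A', 'OTHER'),
  ('2020-01-02', 'A', 'OTHER');
""")

# Same query text for both cases; {neg} is either empty or NOT.
QUERY = """
SELECT COUNT(*), SUM(s.kilometers)
FROM surveys s
WHERE {neg} EXISTS (
  SELECT *
  FROM specimens p
  WHERE s.date = p.date
    AND s.location = p.location
    AND p.species = 'HASE'
)
"""

found = conn.execute(QUERY.format(neg="")).fetchone()
not_found = conn.execute(QUERY.format(neg="NOT")).fetchone()
total = conn.execute(
    "SELECT COUNT(*), SUM(kilometers) FROM surveys"
).fetchone()

print("HASE found:    ", found)      # (1, 1.5)
print("HASE not found:", not_found)  # (2, 6.5)
print("all surveys:   ", total)      # (3, 8.0)
```

The survey with two HASE specimens is counted once, not twice, because EXISTS only asks whether a matching row is present.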
select count(distinct concat(s.location, s.date))
from surveys s
Inner join specimens p
on (s.date=p.date and s.location=p.location)
where p.species = 'HASE';
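The number of surveys on which seals (HASE) were not found is simply the difference between the total number of surveys (which you already have) and the value above. Since both queries return a single row, a Cartesian product of the two queries would give you both values in a single output row, but here is a different way to do it:
Select count(*), sum(kilometers)
From (
Select s.kilometers
From surveys s
Left join specimens p
on (s.date=p.date and s.location=p.location)
and p.species = 'HASE'
Where p.species is null
) As zero_surveys;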
(There are several other ways to write the queries above.)