
SQL to count records that have no match in one-to-many relationship

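I have two MySQL tables:

surveys (date, location, kilometers), primary key: date + location (one record per survey)

specimens (date, location, species) (zero or more records per survey date and location)

I want to find the count of surveys, and the total kilometers surveyed, for which specimens contains no record of a particular species. In other words, the number of surveys in which a particular species was not found.

The total number of surveys is: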

select count(date) as surveys, sum(kilometers) as KM_surveyed 
from surveys;

 +---------+-------------+
 | surveys | KM_surveyed |
 +---------+-------------+
 |   20141 |    40673.59 |
 +---------+-------------+

Finding the number of surveys in which no specimens at all were found is easy:

select count(s.date) as surveys, sum(s.kilometers) as KM_surveyed 
from surveys as s left join specimens as p 
on (s.date=p.date and s.location=p.location)
where p.date is null;

 +---------+-------------+
 | surveys | KM_surveyed |
 +---------+-------------+
 |    8820 |    15848.26 |
 +---------+-------------+

The total number of records in specimens is:

select count(*) from specimens;

+-----------+
|  count(*) |
+-----------+
|     51566 |
+-----------+ 

The correct number of seals (HASE) found across all surveys is:

select count(*) from specimens where species = 'HASE';

 +-----------+
 | count(*)  |
 +-----------+
 |       662 |
 +-----------+

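Finding the number of surveys in which seals (HASE) were found is not so easy.
Because the specimens table usually contains several records per survey, this query returns the number of HASE specimens found rather than the number of surveys:

select count(s.date) as surveys, sum(s.kilometers) as KM_surveyed 
from surveys as s 
left join specimens as p on (s.date=p.date and s.location=p.location) 
where p.species = 'HASE';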

 +---------+-------------+
 | surveys | KM_surveyed |
 +---------+-------------+
 |     662 |     2030.70 |  WRONG! That is the number of specimens, not surveys
 +---------+-------------+
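
The row multiplication behind that wrong result can be reproduced on a toy dataset. This is only a sketch: it runs the same join through Python's sqlite3 module instead of MySQL, and the four surveys, the five specimens, and the second species code 'GRSE' are made up for illustration.

```python
import sqlite3

# Hypothetical toy data: 4 surveys, 5 specimens ('GRSE' is an invented code).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                          PRIMARY KEY (date, location));
    CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
    INSERT INTO surveys VALUES
        ('2020-01-01', 'A', 2.0), ('2020-01-02', 'A', 3.0),
        ('2020-01-03', 'B', 5.0), ('2020-01-04', 'B', 7.0);
    INSERT INTO specimens VALUES
        ('2020-01-01', 'A', 'HASE'), ('2020-01-01', 'A', 'HASE'),
        ('2020-01-01', 'A', 'GRSE'), ('2020-01-02', 'A', 'GRSE'),
        ('2020-01-03', 'B', 'HASE');
""")

# The join emits one row per matching specimen, so a survey with two
# HASE specimens is counted twice and its kilometers are summed twice.
naive = conn.execute("""
    SELECT COUNT(s.date), SUM(s.kilometers)
    FROM surveys AS s
    LEFT JOIN specimens AS p
      ON s.date = p.date AND s.location = p.location
    WHERE p.species = 'HASE'
""").fetchone()

specimen_count = conn.execute(
    "SELECT COUNT(*) FROM specimens WHERE species = 'HASE'"
).fetchone()[0]

print(naive, specimen_count)
```

Only two of the toy surveys contain HASE, yet the join reports three rows, which is exactly the HASE specimen count.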

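Finding the number of surveys in which seals (HASE) were not found is not easy either. Instead of the number of surveys, this query returns the number of specimens found that are not seals:

select count(s.date) as surveys, sum(s.kilometers) as KM_surveyed 
from surveys as s 
left join specimens as p on (s.date=p.date and s.location=p.location) 
where p.species <> 'HASE' or p.date is null;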

 +---------+-------------+
 | surveys | KM_surveyed |
 +---------+-------------+
 |   50904 |   151310.49 | 
 +---------+-------------+

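Wrong! 50904 = number of non-HASE specimens

How can I structure a query that correctly counts the number of surveys in which seals were found and the number in which they were not?

When you use a LEFT JOIN to find non-matching rows, put the criteria that should not match in the ON clause rather than the WHERE clause.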

SELECT COUNT(*), SUM(s.kilometers)
FROM surveys AS s
LEFT JOIN specimens AS p ON s.date = p.date and s.location = p.location
    AND p.species = 'HASE'
WHERE p.date IS NULL
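
To see why the placement of the species test matters, the two variants can be compared on a toy dataset. This is a sketch under stated assumptions: Python's sqlite3 module stands in for MySQL, and the rows (including the invented species code 'GRSE') are hypothetical.

```python
import sqlite3

# Hypothetical toy data: surveys on 01-01 and 01-03 contain HASE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                          PRIMARY KEY (date, location));
    CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
    INSERT INTO surveys VALUES
        ('2020-01-01', 'A', 2.0), ('2020-01-02', 'A', 3.0),
        ('2020-01-03', 'B', 5.0), ('2020-01-04', 'B', 7.0);
    INSERT INTO specimens VALUES
        ('2020-01-01', 'A', 'HASE'), ('2020-01-01', 'A', 'HASE'),
        ('2020-01-01', 'A', 'GRSE'), ('2020-01-02', 'A', 'GRSE'),
        ('2020-01-03', 'B', 'HASE');
""")

JOIN = """
    SELECT COUNT(*), SUM(s.kilometers)
    FROM surveys AS s
    LEFT JOIN specimens AS p
      ON s.date = p.date AND s.location = p.location
"""

# Species test inside ON: a survey keeps a NULL-padded row only when it
# has no HASE specimen at all, so each such survey is counted once.
on_clause = conn.execute(
    JOIN + " AND p.species = 'HASE' WHERE p.date IS NULL"
).fetchone()

# Species test in WHERE: every non-HASE specimen row survives, including
# the 'GRSE' row of the 01-01 survey, which did contain HASE.
where_clause = conn.execute(
    JOIN + " WHERE p.species <> 'HASE' OR p.date IS NULL"
).fetchone()

print(on_clause, where_clause)
```

On this data the ON version reports the two surveys without HASE, while the WHERE version reports three rows because it also picks up a survey in which HASE was found.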

You can use EXISTS / NOT EXISTS subqueries in the WHERE clause.

Surveys for which HASE was found in the specimens table:

select count(*), sum(s.kilometers)
from surveys s
where exists (
    select *
    from specimens p
    where s.date=p.date
      and s.location=p.location
      and p.species = 'HASE'
)

Surveys for which HASE was not found in the specimens table:

select count(*), sum(s.kilometers)
from surveys s
where not exists (
    select *
    from specimens p
    where s.date=p.date
      and s.location=p.location
      and p.species = 'HASE'
)
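
A quick sanity check on this pair of queries is that together they partition the surveys table: the found and not-found counts should add up to the total. The sketch below checks that on hypothetical toy rows, using Python's sqlite3 module in place of MySQL; the helper name `survey_totals` and the species code 'GRSE' are invented for the example.

```python
import sqlite3

# Hypothetical toy data: surveys on 01-01 and 01-03 contain HASE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                          PRIMARY KEY (date, location));
    CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
    INSERT INTO surveys VALUES
        ('2020-01-01', 'A', 2.0), ('2020-01-02', 'A', 3.0),
        ('2020-01-03', 'B', 5.0), ('2020-01-04', 'B', 7.0);
    INSERT INTO specimens VALUES
        ('2020-01-01', 'A', 'HASE'), ('2020-01-01', 'A', 'HASE'),
        ('2020-01-01', 'A', 'GRSE'), ('2020-01-02', 'A', 'GRSE'),
        ('2020-01-03', 'B', 'HASE');
""")

def survey_totals(conn, species, found):
    """Return (survey count, km sum) for surveys with or without `species`."""
    op = "EXISTS" if found else "NOT EXISTS"
    # The subquery only tests for existence, so each survey row is
    # kept or dropped exactly once, however many specimens match.
    return conn.execute(f"""
        SELECT COUNT(*), SUM(s.kilometers)
        FROM surveys AS s
        WHERE {op} (
            SELECT 1
            FROM specimens AS p
            WHERE s.date = p.date
              AND s.location = p.location
              AND p.species = ?
        )
    """, (species,)).fetchone()

found = survey_totals(conn, "HASE", found=True)
not_found = survey_totals(conn, "HASE", found=False)
total = conn.execute(
    "SELECT COUNT(*), SUM(kilometers) FROM surveys"
).fetchone()

print(found, not_found, total)
```

If the two counts ever fail to add up to the total, the subquery's correlation conditions are the first thing to check.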

An alternative to the first query might be:

select count(*), sum(s.kilometers)
from (
    select distinct date, location
    from specimens
    where species = 'HASE'
) p
join surveys s using (date, location)

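Depending on the data (if 'HASE' is a rare species), it might be faster.

Barmar has already posted the best option for the second query.

Why do people find joins so difficult?

Finding the number of surveys in which seals (HASE) were found: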
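
The derived table works because DISTINCT collapses the specimens to at most one row per (date, location) before the join, so no survey can be matched twice. A minimal sketch on hypothetical toy rows, run through Python's sqlite3 module rather than MySQL (the species code 'GRSE' is invented); it says nothing about the speed claim, which depends on the real data and indexes.

```python
import sqlite3

# Hypothetical toy data: the 01-01 survey has two HASE specimens.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                          PRIMARY KEY (date, location));
    CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
    INSERT INTO surveys VALUES
        ('2020-01-01', 'A', 2.0), ('2020-01-02', 'A', 3.0),
        ('2020-01-03', 'B', 5.0), ('2020-01-04', 'B', 7.0);
    INSERT INTO specimens VALUES
        ('2020-01-01', 'A', 'HASE'), ('2020-01-01', 'A', 'HASE'),
        ('2020-01-01', 'A', 'GRSE'), ('2020-01-02', 'A', 'GRSE'),
        ('2020-01-03', 'B', 'HASE');
""")

# Step 1: the derived table, one row per survey that has any HASE.
pairs = conn.execute("""
    SELECT DISTINCT date, location
    FROM specimens
    WHERE species = 'HASE'
    ORDER BY date, location
""").fetchall()

# Step 2: joining those unique pairs to surveys cannot duplicate a survey.
found = conn.execute("""
    SELECT COUNT(*), SUM(s.kilometers)
    FROM (
        SELECT DISTINCT date, location
        FROM specimens
        WHERE species = 'HASE'
    ) AS p
    JOIN surveys AS s USING (date, location)
""").fetchone()

print(pairs, found)
```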

select count(distinct concat(s.location, s.date))
from surveys s 
Inner join specimens p 
on (s.date=p.date and s.location=p.location) 
where p.species = 'HASE';

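The number of surveys in which seals (HASE) were not found is simply the difference between the number of surveys (which you already have) and the value above. Since both queries return a single row, a Cartesian product of the two would give the values in a single output row, but here is a different approach:

Select count(*), sum(kilometers)
From (
  Select s.kilometers
  From surveys s
  Left join specimens p 
  on (s.date=p.date and s.location=p.location) 
  and p.species = 'HASE'
  Where p.species is null
) As zero_surveys;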

(There are several other ways of writing the query above.)
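
The Cartesian-product idea mentioned in this answer can be sketched as a cross join of two single-row subqueries, subtracting the HASE totals from the overall totals. This is an illustration only: it uses Python's sqlite3 module instead of MySQL, hypothetical toy rows, invented aliases (`t`, `f`, `n`, `km`), and an EXISTS subquery for the found side, which is one of several possible choices.

```python
import sqlite3

# Hypothetical toy data: 4 surveys (17 km), 2 of them with HASE (7 km).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE surveys (date TEXT, location TEXT, kilometers REAL,
                          PRIMARY KEY (date, location));
    CREATE TABLE specimens (date TEXT, location TEXT, species TEXT);
    INSERT INTO surveys VALUES
        ('2020-01-01', 'A', 2.0), ('2020-01-02', 'A', 3.0),
        ('2020-01-03', 'B', 5.0), ('2020-01-04', 'B', 7.0);
    INSERT INTO specimens VALUES
        ('2020-01-01', 'A', 'HASE'), ('2020-01-01', 'A', 'HASE'),
        ('2020-01-01', 'A', 'GRSE'), ('2020-01-02', 'A', 'GRSE'),
        ('2020-01-03', 'B', 'HASE');
""")

# Each subquery returns exactly one row, so the cross join is one row too.
# COALESCE guards against SUM returning NULL when no survey has HASE.
not_found = conn.execute("""
    SELECT t.n - f.n, t.km - f.km
    FROM (
        SELECT COUNT(*) AS n, SUM(kilometers) AS km
        FROM surveys
    ) AS t
    CROSS JOIN (
        SELECT COUNT(*) AS n, COALESCE(SUM(s.kilometers), 0) AS km
        FROM surveys AS s
        WHERE EXISTS (
            SELECT 1
            FROM specimens AS p
            WHERE s.date = p.date
              AND s.location = p.location
              AND p.species = 'HASE'
        )
    ) AS f
""").fetchone()

print(not_found)
```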
