
How to understand results of counting distinct values in group by and having clause

Database: Percona Server, version 8.0.26-17 (https://www.percona.com/doc/percona-server/8.0/release-notes/Percona-Server-8.0.26-17.html)

I have two queries that yield different results. I don't understand why.

1)

select eev_company_id,
count(distinct maj.dsd_prefix) as maj_cnt,
count(distinct min.dsd_prefix) as min_cnt
from ehev_most_recent as eev
inner join ekohubschema as ehs on  ehs.ehs_subcategory = eev.eev_subcategory
left join datasourcedescription as maj on maj.dsd_prefix = eev.eev_prefix and maj.dsd_type_id = 'MAJ'
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
where ehs.ehs_category <> 'Exclusionary Factors'
group by eev.eev_company_id
having eev.eev_company_id = 'ADD53604';

result is:

+----------------+---------+---------+
| eev_company_id | maj_cnt | min_cnt |
+----------------+---------+---------+
| ADD53604       |       2 |       1 |
+----------------+---------+---------+

The second query is almost the same, but instead of grouping by eev_company_id and filtering in HAVING, it applies the company filter as an additional AND condition in the WHERE clause:

2)

select 
count(distinct maj.dsd_prefix) as maj_cnt,
count(distinct min.dsd_prefix) as min_cnt
from ehev_most_recent as eev
inner join ekohubschema as ehs on ehs.ehs_subcategory = eev.eev_subcategory
left join datasourcedescription as maj on maj.dsd_prefix = eev.eev_prefix and maj.dsd_type_id = 'MAJ'
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
where ehs.ehs_category <> 'Exclusionary Factors' AND eev.eev_company_id = 'ADD53604';

This query results in:

+---------+---------+
| maj_cnt | min_cnt |
+---------+---------+
|       2 |       0 |
+---------+---------+

As you can see, the min_cnt here is 0 while for the first query it is 1. What is the reason for the difference?
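Queries 1 and 2 differ in two respects: where the company filter is applied, and whether there is a GROUP BY at all. An intermediate form that keeps the GROUP BY and only moves the filter into WHERE may help isolate which of the two changes matters. The sketch below uses only the tables and columns from the queries above; what it returns on the affected server is not known here.

```sql
-- Same joins as query 1, but the company filter is applied before grouping
-- (WHERE) instead of after it (HAVING).
select eev.eev_company_id,
       count(distinct maj.dsd_prefix) as maj_cnt,
       count(distinct min.dsd_prefix) as min_cnt
from ehev_most_recent as eev
inner join ekohubschema as ehs on ehs.ehs_subcategory = eev.eev_subcategory
left join datasourcedescription as maj on maj.dsd_prefix = eev.eev_prefix and maj.dsd_type_id = 'MAJ'
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
where ehs.ehs_category <> 'Exclusionary Factors'
  and eev.eev_company_id = 'ADD53604'
group by eev.eev_company_id;
```

In standard SQL, a HAVING condition on the grouping column selects the same group that the equivalent WHERE condition would, so this form and query 1 are expected to return identical counts.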

If I remove the ekohubschema join from the first query, it returns the same counts as the second query:

3)

select eev_company_id,
count(distinct maj.dsd_prefix) as maj_cnt,
count(distinct min.dsd_prefix) as min_cnt
from ehev_most_recent as eev
left join datasourcedescription as maj on maj.dsd_prefix = eev.eev_prefix and maj.dsd_type_id = 'MAJ'
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
group by eev.eev_company_id
having eev.eev_company_id = 'ADD53604'; 

+----------------+---------+---------+
| eev_company_id | maj_cnt | min_cnt |
+----------------+---------+---------+
| ADD53604       |       2 |       0 |
+----------------+---------+---------+

The ekohubschema table has the following columns: ehs_category, ehs_subcategory and ehs_long_description. It has no company ID column at all, and yet joining it changes the result.

I don't see any minor datasources, only major. This is why I struggle to find out where the count 1 (for min_cnt ) comes from.
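One way to look for the source of the unexpected count is to list the distinct prefixes instead of counting them. The following is a diagnostic sketch only: it reuses query 1 unchanged except that COUNT(DISTINCT ...) is swapped for GROUP_CONCAT(DISTINCT ...), and it assumes that the aggregation problem, if there is one, does not affect GROUP_CONCAT in the same way.

```sql
-- Diagnostic variant of query 1: show which prefixes each count is made of.
select eev.eev_company_id,
       group_concat(distinct maj.dsd_prefix order by maj.dsd_prefix) as maj_prefixes,
       group_concat(distinct min.dsd_prefix order by min.dsd_prefix) as min_prefixes
from ehev_most_recent as eev
inner join ekohubschema as ehs on ehs.ehs_subcategory = eev.eev_subcategory
left join datasourcedescription as maj on maj.dsd_prefix = eev.eev_prefix and maj.dsd_type_id = 'MAJ'
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
where ehs.ehs_category <> 'Exclusionary Factors'
group by eev.eev_company_id
having eev.eev_company_id = 'ADD53604';
```

If min_prefixes comes back NULL while query 1 still reports min_cnt = 1, the count does not correspond to any actual prefix, which would point at the server rather than at the data.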


Please check the output of this query:

select 
   eev_company_id, 
   min.dsd_prefix
from ehev_most_recent as eev
left join datasourcedescription as min on min.dsd_prefix = eev.eev_prefix and min.dsd_type_id <> 'MAJ'
where eev.eev_company_id = 'ADD53604'; 

I would expect the output to contain at least one row with a non-NULL min.dsd_prefix. If it does not, this is a bug.

I did, and I think it's a bug; see DBFIDDLE. I reported it here: bug 106539.

The bug also exists in MariaDB 10.6; see DBFIDDLE.
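For anyone who wants to compare the two filter placements on their own server, here is a minimal self-contained sketch. The table names (demo_events, demo_sources), the alias mnr and all data are hypothetical and are not taken from the fiddles above, so this script is not claimed to trigger the bug; it only shows the shape of such a comparison.

```sql
-- Hypothetical tables and data, for illustration only.
create table demo_events  (company_id varchar(10), prefix varchar(10));
create table demo_sources (prefix varchar(10), type_id varchar(3));

insert into demo_events  values ('A', 'p1'), ('A', 'p2'), ('B', 'p3');
insert into demo_sources values ('p1', 'MAJ'), ('p2', 'MAJ'), ('p3', 'MIN');

-- Variant 1: filter after grouping (HAVING on the grouping column).
select e.company_id,
       count(distinct maj.prefix) as maj_cnt,
       count(distinct mnr.prefix) as min_cnt
from demo_events as e
left join demo_sources as maj on maj.prefix = e.prefix and maj.type_id = 'MAJ'
left join demo_sources as mnr on mnr.prefix = e.prefix and mnr.type_id <> 'MAJ'
group by e.company_id
having e.company_id = 'A';

-- Variant 2: filter before grouping (WHERE), everything else unchanged.
select e.company_id,
       count(distinct maj.prefix) as maj_cnt,
       count(distinct mnr.prefix) as min_cnt
from demo_events as e
left join demo_sources as maj on maj.prefix = e.prefix and maj.type_id = 'MAJ'
left join demo_sources as mnr on mnr.prefix = e.prefix and mnr.type_id <> 'MAJ'
where e.company_id = 'A'
group by e.company_id;
```

Under standard SQL semantics both variants should report maj_cnt = 2 and min_cnt = 0 for company A, because company A has two major prefixes and no non-major ones in this data. A difference between the two results on a given server would indicate a server problem, not a data problem.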

