Why mysql select count(distinct user_id) return wrong number？

Question

I have a big table in mysql.It has 13 million rows.

Mysql version is 5.7.10.

Table structure as below:

create table table_name (    
  user_id varchar(20) not null,    
  item_id varchar(20) not null 
);

1. The first sql is:

select count(distinct user_id) from table;

result:760,000

2. The second sql is:

select count(1) from (select user_id from table group by user_id) a;

result:120,000

user_id is not null for each row.

And, the right number is 120,000.Why the first sql get the wrong number?

Then,I run the first sql in hive and spark-sql, the result is 120,000.

So, is this a mysql's bug or something can be setting to make things right?

Thank you!

Update :I try it on another PC, the result of first sql is 120,000.This time get the right number.Mysql version is 5.6.26. So, maybe it is a bug of 5.7.10.

Answer 1

There are multiple known bugs in MySQL count distinct when a column is included in two-column unique key.

here and here

Why mysql select count(distinct user_id) return wrong number？

Question

1 answers

solution1
1 2017-03-03 14:20:01

Why mysql select count(distinct user_id) return wrong number？

Question

1 answers

solution1 1 2017-03-03 14:20:01

solution1
1 2017-03-03 14:20:01