'in' and 'not in' counts do not add up - what's wrong?

I have some servers. Some of them have ips assigned. I want to figure out how many do not. There are clearly more servers than have ips assigned, but my DB tells me there are no servers that have no ips assigned...

I'm at my wit's end here. Is my DB corrupted in some strange way?

SELECT COUNT(*) 
  FROM server

...returns:

+----------+
| count(*) |
+----------+
|    23088 | 
+----------+
1 row in set (0.00 sec)

This:

SELECT COUNT(*) 
  FROM server 
 WHERE server_id IN (SELECT DISTINCT(server_id) 
                       FROM ips)

...returns:

+----------+
| count(*) |
+----------+
|    13811 | 
+----------+
1 row in set (0.01 sec)

This:

SELECT COUNT(*) 
  FROM server 
 WHERE server_id NOT IN (SELECT DISTINCT(server_id) 
                           FROM ips);

...returns:

+----------+
| count(*) |
+----------+
|        0 | 
+----------+
1 row in set (0.02 sec)

Results have been edited to protect the guilty, but you get the idea.

  • All tables are InnoDB.
  • CHECK TABLE returns OK on both of these tables.

EDIT: thank you for the suggestion of using LEFT JOIN. This definitely confirms that the problem is the MySQL bug.

mysql> SELECT count(s.server_id) FROM server s LEFT JOIN ips i on s.server_id = i.server_id WHERE i.server_id IS NULL;
+--------------------+
| count(s.server_id) |
+--------------------+
|               9277 | 
+--------------------+
1 row in set (0.04 sec)

Since 9277 + 13811 = 23088, this means that all servers without ips + all servers with ips does indeed == all servers.

I've scheduled an upgrade of the MySQL server for the beginning of next week. Stay tuned.

What version of MySQL? There seems to be a bug in versions before 5.0.25 / 5.1.12 that might be the culprit:

Bug #21282: NOT IN, more than 1000 returns incorrect results with INDEX

Using a SELECT ... WHERE some_field NOT IN (...) and then 1000 or more values in the NOT IN part causes the server to return incorrect results if there is an INDEX/UNIQUE key on some_field. Less than 1000 criteria works correctly.

Do you have NULLs in your column?

server_id NOT IN (ids) does not match rows where server_id is NULL, so you only get the servers with a non-NULL server_id that isn't among those in ips. To find the NULL ones, you'll want to use WHERE server_id IS NULL instead.

Assuming the bug truppo found causes this, you could use this workaround:

select count(*)
from server s
left join ips i on i.server_id = s.server_id
where i.server_id is null

Above, i.server_id is null is true if the left join did not find a match (in that situation every column from i yields null).
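To see the anti-join pattern in isolation, here is a minimal sketch on made-up data. It runs the SQL through Python's built-in sqlite3 module rather than MySQL, purely so it is self-contained; the table contents and the scalar helper are assumptions for illustration, not the asker's schema.

```python
import sqlite3

# Hypothetical toy data: 5 servers, 3 of which have at least one row in ips.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE server (server_id INTEGER PRIMARY KEY);
    CREATE TABLE ips (ip TEXT, server_id INTEGER);
    INSERT INTO server (server_id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO ips (ip, server_id) VALUES
        ('10.0.0.1', 1), ('10.0.0.2', 1), ('10.0.0.3', 2), ('10.0.0.4', 3);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


total = scalar("SELECT COUNT(*) FROM server")
with_ips = scalar(
    "SELECT COUNT(*) FROM server WHERE server_id IN (SELECT server_id FROM ips)"
)
# Anti-join: keep only the server rows for which the left join found no match.
without_ips = scalar("""
    SELECT COUNT(*)
    FROM server s
    LEFT JOIN ips i ON i.server_id = s.server_id
    WHERE i.server_id IS NULL
""")

print(total, with_ips, without_ips)
```

Server 1 has two ips rows, but both joined rows carry a non-null i.server_id, so the filter drops them and the server is not miscounted.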

Do you have any record that has a NULL for server_id? Because it would be excluded in both cases.

If you have NULL in your columns, the comparison evaluates to NULL (unknown) rather than true in both cases, so those rows are dropped by both queries. The counts you are getting satisfy: in + not in = total - nulls.
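A small sketch of that effect, on invented data and using SQLite through Python's sqlite3 module as a stand-in for MySQL (the NULL comparison rules involved are standard SQL, but this is an illustration, not a reproduction of the asker's database):

```python
import sqlite3

# Hypothetical toy data: one server row has no server_id at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE server (server_id INTEGER);
    CREATE TABLE ips (ip TEXT, server_id INTEGER);
    INSERT INTO server (server_id) VALUES (1), (2), (3), (NULL);
    INSERT INTO ips (ip, server_id) VALUES ('10.0.0.1', 1), ('10.0.0.2', 2);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


total = scalar("SELECT COUNT(*) FROM server")
in_count = scalar(
    "SELECT COUNT(*) FROM server WHERE server_id IN (SELECT server_id FROM ips)"
)
not_in_count = scalar(
    "SELECT COUNT(*) FROM server WHERE server_id NOT IN (SELECT server_id FROM ips)"
)
null_count = scalar("SELECT COUNT(*) FROM server WHERE server_id IS NULL")

# The NULL row satisfies neither predicate, so it is missing from both counts.
print(total, in_count, not_in_count, null_count)
```

Here the two counts fall short of the total by exactly the number of NULL rows. Note that this alone would not produce a NOT IN count of 0 while other servers genuinely lack ips.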

select count(*) 
from server 
where server_id not in (select distinct(server_id) from ips)
or server_id is NULL
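The query above covers a NULL on the server side. A NULL on the ips side is a separate case that may be worth ruling out, although nothing in the question says it applies here: if the subquery returns even one NULL, standard SQL makes NOT IN unknown for every row that has no match, so the count drops to 0. A hedged sketch on invented data, again using sqlite3 as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical toy data: one ips row is not attached to any server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE server (server_id INTEGER);
    CREATE TABLE ips (ip TEXT, server_id INTEGER);
    INSERT INTO server (server_id) VALUES (1), (2), (3), (4);
    INSERT INTO ips (ip, server_id) VALUES
        ('10.0.0.1', 1), ('10.0.0.2', 2), ('10.0.0.9', NULL);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


in_count = scalar(
    "SELECT COUNT(*) FROM server WHERE server_id IN (SELECT server_id FROM ips)"
)
# 3 NOT IN (1, 2, NULL) is unknown, not true, so no row passes the filter.
not_in_count = scalar(
    "SELECT COUNT(*) FROM server WHERE server_id NOT IN (SELECT server_id FROM ips)"
)
# Adding OR server_id IS NULL does not help: the NULL is in ips, not in server.
not_in_or_null = scalar("""
    SELECT COUNT(*) FROM server
    WHERE server_id NOT IN (SELECT server_id FROM ips)
       OR server_id IS NULL
""")
# Excluding NULLs inside the subquery restores the expected answer.
not_in_filtered = scalar("""
    SELECT COUNT(*) FROM server
    WHERE server_id NOT IN (
        SELECT server_id FROM ips WHERE server_id IS NOT NULL
    )
""")

print(in_count, not_in_count, not_in_or_null, not_in_filtered)
```

A quick check for this case is SELECT COUNT(*) FROM ips WHERE server_id IS NULL; if that returns 0, this explanation does not apply.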

I would assume that there is something strange going on with the IN and NOT IN. Might be a bug or a "known limitation".

I'd suggest first trying to answer your initial question (servers without an ip) and then having a look at the data; maybe that gives you an indication of what might be going on.

So here are some alternative ideas to give you what you are looking for (note that MINUS is Oracle syntax and MySQL does not accept it, so the second form is the one to use on MySQL):

SELECT server_id
FROM server
MINUS
SELECT server_id
FROM ips

Or

SELECT server_id
FROM server s LEFT JOIN ips i on s.server_id = i.server_id
WHERE i.server_id is null

As said above, this may give you an idea of why the data is not "caught" by your original statements.
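For the set-difference idea, the standard SQL spelling is EXCEPT, which as far as I know MySQL only gained in 8.0.31. The sketch below runs it on invented data through Python's sqlite3 module, since SQLite supports EXCEPT; treat it as an illustration of the technique rather than something the asker's MySQL version would accept.

```python
import sqlite3

# Hypothetical toy data: servers 4 and 5 have no row in ips.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE server (server_id INTEGER PRIMARY KEY);
    CREATE TABLE ips (ip TEXT, server_id INTEGER);
    INSERT INTO server (server_id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO ips (ip, server_id) VALUES
        ('10.0.0.1', 1), ('10.0.0.2', 2), ('10.0.0.3', 3);
""")

# Set difference: every server_id in server that never appears in ips.
rows = conn.execute("""
    SELECT server_id FROM server
    EXCEPT
    SELECT server_id FROM ips
    ORDER BY server_id
""").fetchall()
missing = [row[0] for row in rows]

print(missing)
```

Because this returns the ids themselves rather than a count, it is easy to inspect a few of the affected rows by hand.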
