MySQL SELECT returns wrong results

I'm working with MySQL 5.7. I created a table with a virtual (not stored) generated column of type DATETIME with an index on it. While working on it, I noticed that ORDER BY was not returning all the data (some rows I was expecting at the top were missing). The results from MAX and MIN were also wrong. After I ran

ANALYZE TABLE 
CHECK TABLE
OPTIMIZE TABLE

the results were correct. I guess there was an issue with the index data, so I have a few questions:

  1. When and why could this happen?
  2. Is there a way to prevent it?
  3. Of the three commands I ran, which is the correct one to use?

I'm worried that this could happen again in the future without my noticing.

EDIT:

As requested in the comments, I have added the table definition:

CREATE TABLE `items` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `user_id` bigint(20) unsigned DEFAULT NULL,
  `image` json DEFAULT NULL,
  `status` json DEFAULT NULL,
  `status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL COMMENT 'used for index: it checks if status contains expired=true',
  `lifetime` tinyint(4) NOT NULL,
  `expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,
  `last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `expiration` (`status_expired`,`expiration`) USING BTREE,
  CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED

Queries that were returning the wrong results:

SELECT * FROM items ORDER BY expiration DESC;
SELECT max(expiration),min(expiration) FROM items;

Thanks

TL;DR

The trouble is that your data comes from virtual columns materialized via indexes. Running CHECK, ANALYZE and OPTIMIZE on the table got the index back in sync (OPTIMIZE TABLE, in particular, rebuilds an InnoDB table together with its indexes), which fixed the errors. That gives you correct results from then on, at least until the index gets out of sync again.
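For reference, here is roughly what each of the three statements does on an InnoDB table in MySQL 5.7. The table name `items` is taken from the question, and the comments summarise general documented behaviour rather than anything specific to this bug:

```sql
-- Only refreshes the key distribution statistics used by the optimizer;
-- it does not rewrite index records.
ANALYZE TABLE items;

-- Checks the table and its indexes for errors and reports them;
-- it does not repair anything on InnoDB.
CHECK TABLE items;

-- For InnoDB this is mapped to ALTER TABLE ... FORCE, which rebuilds
-- the table and its secondary indexes.
OPTIMIZE TABLE items;
```

If the index contents really were stale, the rebuild done by OPTIMIZE TABLE is the most likely of the three to have corrected them. That is an inference from what the statements do, not something verified against your data.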

Why it may happen

Many of the problems are caused by issues with your table design. Let's start with:

`status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL

No doubt this was created to overcome the fact that you cannot directly index a JSON column in MySQL. You have created a virtual column and indexed that instead. That's all very well, but this column can hold only one of two values: true or false. That means it has very poor cardinality. As a result, MySQL is unlikely to use this index for anything.
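One way to check this on your own data, assuming the `items` table from the question, is to look at the cardinality estimate MySQL keeps for each index and at the plan it picks for a query that filters on the flag:

```sql
-- The Cardinality column shows the estimated number of distinct values
-- for each index column; expect a very small number for status_expired.
SHOW INDEX FROM items;

-- The "key" column of the output shows which index, if any, the optimizer chose.
EXPLAIN SELECT * FROM items WHERE status_expired = 1;
```

What the optimizer actually chooses depends on your data distribution, so treat the output as a diagnostic rather than a guarantee either way.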

But we can see that you have combined the status_expired column with the expiration column when creating the index, perhaps with the idea of overcoming the poor cardinality mentioned above. But wait...

`expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,

expiration is another virtual column. This has some repercussions.

When a secondary index is created on a generated virtual column, generated column values are materialized in the records of the index. If the index is a covering index (one that includes all the columns retrieved by a query), generated column values are retrieved from materialized values in the index structure instead of computed “on the fly”.

Ref: https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html#json-column-indirect-index
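A rough way to notice when the materialized values have drifted is to compare an aggregate read through the index with the same aggregate recomputed from the base columns. This is only a sketch, assuming the table, column and index names from the question:

```sql
-- Read the generated column through the secondary index.
SELECT MAX(expiration) AS max_idx, MIN(expiration) AS min_idx
FROM items FORCE INDEX (expiration);

-- Recompute the same expression from the base columns, with the index ignored.
SELECT MAX(create_date + INTERVAL lifetime DAY) AS max_calc,
       MIN(create_date + INTERVAL lifetime DAY) AS min_calc
FROM items IGNORE INDEX (expiration);
```

If the two result rows differ, the index no longer matches the base data. Index hints are strong suggestions rather than guarantees, so running EXPLAIN on each statement is a sensible way to confirm which index was actually used.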

This is contrary to

VIRTUAL: Column values are not stored, but are evaluated when rows are read, immediately after any BEFORE triggers. A virtual column takes no storage.

Ref: https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html

We create virtual columns based on the sound principle that values generated by simple operations on other columns shouldn't be stored, to avoid redundancy. But by creating an index on such a column, we reintroduce that redundancy.

Proposed fixes

Based on the information provided, you don't really seem to need the status_expired column, or even the expiration column in its current form. An item that's past its expiry date is expired!

CREATE TABLE `items` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `user_id` bigint(20) unsigned DEFAULT NULL,
  `image` json DEFAULT NULL,
  `status` json DEFAULT NULL,
  `lifetime` tinyint(4) NOT NULL,
  `expire_date` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,
  `last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `expiration` (`expire_date`) USING BTREE,
  CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED

Simply compare the current date with the expire_date column in the above table when you need to find out which items have expired. The difference here is that, instead of an expired flag being calculated in every query, expire_date is calculated once, when you create the record.

This makes your table a lot neater and your queries possibly faster.
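With that table, finding the expired items could look like the query below. The selected columns and the sort order are only an illustration:

```sql
-- Items whose expiry date has already passed, most recently expired first.
SELECT id, user_id, expire_date
FROM items
WHERE expire_date < NOW()
ORDER BY expire_date DESC;
```

Because the filter and the sort are both on expire_date, the single-column index on it is a natural candidate for this query. EXPLAIN will show whether the optimizer actually picks it.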