
MySQL 5.7 - JSON Indexing - Generated Columns with non-scalar values

I have been playing with the JSON support in MySQL 5.7. I have a few questions about generated columns used for the purpose of indexing: https://dev.mysql.com/doc/refman/5.7/en/create-table.html#create-table-secondary-indexes-virtual-columns

Specifically, refer to this line:

JSON columns cannot be indexed. You can work around this restriction by creating an index on a generated column that extracts a scalar value from the JSON column.

This seems to be a big limitation for me. Everywhere I look, people suggest using generated columns, but that workaround only covers a very limited set of use cases. Or perhaps I am misunderstanding something.

Setting the stage

Let me explain my use case. Suppose you have a table called standards. It has the following structure:

CREATE TABLE `standards` (
  `id` int(11) NOT NULL,
  `name` varchar(100) NOT NULL,
  `sections` json DEFAULT NULL,
  `subjects` json DEFAULT NULL,
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The sections column contains an array of JSON objects:

[
  {
    "id": 90491,
    "name": "A"
  },
  {
    "id": 90494,
    "name": "B"
  }
]

The subjects column contains a nested JSON object:

{
  "576845": {
    "id": 576845,
    "name": "Computer Education"
  },
  "576848": {
    "id": 576848,
    "name": "English Language"
  },
  "576854": {
    "id": 576854,
    "name": "Environmental Science"
  },
  "576860": {
    "id": 576860,
    "name": "Mathematics"
  }
}
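To try the queries that follow, a row matching the sample data can be loaded roughly like this. This is only a sketch: the id 1, the name 'Example Standard' and the NOW() timestamps are made-up placeholders, and only the two JSON values come from the data shown above.

```sql
-- Sample row for experimenting; id, name and timestamps are placeholders.
INSERT INTO standards (id, name, sections, subjects, created_at, updated_at)
VALUES (
  1,
  'Example Standard',
  '[{"id": 90491, "name": "A"}, {"id": 90494, "name": "B"}]',
  '{"576845": {"id": 576845, "name": "Computer Education"},
    "576848": {"id": 576848, "name": "English Language"},
    "576854": {"id": 576854, "name": "Environmental Science"},
    "576860": {"id": 576860, "name": "Mathematics"}}',
  NOW(),
  NOW()
);
```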

Example Queries

Query 1

To find a Standard record which has a section ID of 90494, the query would be:

SELECT * from standards WHERE JSON_CONTAINS( sections->>'$[*].id', '90494' );

Query 2

To find a Standard record which has the subject ID of 576854, the query would be:

SELECT * from standards WHERE JSON_CONTAINS_PATH( subjects, 'one', '$."576854"');

OR

SELECT * from standards WHERE JSON_CONTAINS( subjects->>'$.*.id', '576854' );

Problem

Now, all of the above works. The problem is that the queries perform a full table scan.

Considering Query 1 from above, how can I generate a virtual column with scalar data that contains ALL section IDs?

Each Standard record has multiple sections, with multiple IDs. So, I can't just create an integer virtual column to store a single value. It has to be an array of section IDs, through which we need to search.

So, my generated column would look like this:

ALTER TABLE standards
ADD section_ids json GENERATED ALWAYS AS (sections->>'$[*].id') VIRTUAL NOT NULL;

The generated column will now store just the array of section IDs. But I cannot add an index on the generated column, because it is again a JSON column.
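By contrast, the scalar form described in the manual does accept an index, but it captures only one value per row. A minimal sketch, assuming only the first section of each record mattered; the names first_section_id and idx_first_section_id are made up for illustration:

```sql
-- Hypothetical scalar generated column: only the FIRST section ID of each row.
ALTER TABLE standards
  ADD first_section_id INT UNSIGNED
    GENERATED ALWAYS AS (sections->'$[0].id') VIRTUAL;

-- A scalar generated column can carry an ordinary secondary index.
ALTER TABLE standards
  ADD INDEX idx_first_section_id (first_section_id);

-- This lookup can use the index, but it misses IDs at positions 1, 2, ...
EXPLAIN SELECT * FROM standards WHERE first_section_id = 90491;
```

With the sample data, section 90494 sits at position 1 of the array, so it would never be found through this column, which is exactly the limitation described above.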

Question - How to utilize an index?

So, the question comes down to this: for my queries shown above, how do I avoid full table scans?

Any suggestions would be appreciated.

I won't say it isn't possible with MySQL 5.7, because it is, with clunky workarounds and limitations. But I will not go into a how-to for that version: it is much more difficult, and in many cases the limitations will be reached if a large number of items can be added to the array.

However, it is possible as of MySQL 8.0.17, which now supports multi-valued indexes.

ALTER TABLE standards
  ADD INDEX section_ids ( (CAST(sections->'$[*].id' AS UNSIGNED ARRAY)) ),
  ADD INDEX subject_ids ( (CAST(subjects->'$.*.id' AS UNSIGNED ARRAY)) );

** Note that $.* will take all object properties and return the queried value (.id) of each, formatted as an array.

EXPLAIN SELECT * from standards WHERE JSON_CONTAINS( sections->'$[*].id', '90494' );
EXPLAIN SELECT * from standards WHERE JSON_CONTAINS( subjects->'$.*.id', '576854' );

You will see that the indexes are used for those queries.
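Besides JSON_CONTAINS(), MySQL 8.0.17 also added MEMBER OF() and JSON_OVERLAPS(), which the optimizer can serve from a multi-valued index as well. A short sketch against the same table and indexes; the literal IDs come from the sample data in the question, and whether the index is actually chosen should be confirmed with EXPLAIN on your own data:

```sql
-- Single-value lookup: is 90494 one of the section IDs?
EXPLAIN SELECT * FROM standards
WHERE 90494 MEMBER OF(sections->'$[*].id');

-- Any-of lookup: does the row contain at least one of these subject IDs?
EXPLAIN SELECT * FROM standards
WHERE JSON_OVERLAPS(subjects->'$.*.id', CAST('[576854, 576860]' AS JSON));
```

Note that the expression in the WHERE clause has to match the indexed expression (here sections->'$[*].id' and subjects->'$.*.id') for the index to be considered.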

I would solve this in older versions by manually creating a separate index table, and using triggers to keep it up to date.
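A rough sketch of that idea for the section IDs on MySQL 5.7, not a tested production setup. The table standard_section_ids, the procedure sync_standard_section_ids and the trigger names are all hypothetical, and the DELIMITER lines assume the mysql command-line client. Since 5.7 has no JSON_TABLE(), the array is walked with a loop:

```sql
-- Hypothetical lookup table: one row per (standard, section id) pair.
CREATE TABLE standard_section_ids (
  standard_id INT NOT NULL,
  section_id  INT UNSIGNED NOT NULL,
  PRIMARY KEY (standard_id, section_id),
  KEY idx_section_id (section_id)
) ENGINE=InnoDB;

DELIMITER //

-- Hypothetical helper: rebuilds the lookup rows for one standards row.
CREATE PROCEDURE sync_standard_section_ids(IN p_standard_id INT, IN p_sections JSON)
BEGIN
  DECLARE i INT DEFAULT 0;
  DECLARE n INT DEFAULT COALESCE(JSON_LENGTH(p_sections), 0);
  DECLARE v INT UNSIGNED;

  DELETE FROM standard_section_ids WHERE standard_id = p_standard_id;

  -- Walk the array by position and copy each element's id.
  WHILE i < n DO
    SET v = CAST(JSON_EXTRACT(p_sections, CONCAT('$[', i, '].id')) AS UNSIGNED);
    IF v IS NOT NULL THEN
      INSERT IGNORE INTO standard_section_ids (standard_id, section_id)
      VALUES (p_standard_id, v);
    END IF;
    SET i = i + 1;
  END WHILE;
END//

CREATE TRIGGER standards_ai AFTER INSERT ON standards
FOR EACH ROW CALL sync_standard_section_ids(NEW.id, NEW.sections)//

CREATE TRIGGER standards_au AFTER UPDATE ON standards
FOR EACH ROW CALL sync_standard_section_ids(NEW.id, NEW.sections)//

CREATE TRIGGER standards_ad AFTER DELETE ON standards
FOR EACH ROW DELETE FROM standard_section_ids WHERE standard_id = OLD.id//

DELIMITER ;

-- The lookup then becomes an indexed join instead of a JSON scan.
SELECT s.*
FROM standards s
JOIN standard_section_ids x ON x.standard_id = s.id
WHERE x.section_id = 90494;
```

Rows that already exist in standards would need a one-off backfill, for example by calling the procedure once per row. A second lookup table built the same way could cover the subject IDs.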
