MySQL 5.7 - JSON Indexing - Generated Columns with non-scalar values

I have been playing with the JSON support in MySQL 5.7, and I have a few questions about using generated columns for indexing: https://dev.mysql.com/doc/refman/5.7/en/create-table.html#create-table-secondary-indexes-virtual-columns

Specifically, refer to this line:

JSON columns cannot be indexed. You can work around this restriction by creating an index on a generated column that extracts a scalar value from the JSON column.

This seems like a big limitation to me. Everywhere I look, people suggest using generated columns, but that workaround only covers a very limited set of use cases. Or I am misunderstanding something.

Setting the stage

Let me explain my use case. Suppose you have a table called standards. It has the following structure:

CREATE TABLE `standards` (
  `id` int(11) NOT NULL,
  `name` varchar(100) NOT NULL,
  `sections` json DEFAULT NULL,
  `subjects` json DEFAULT NULL,
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The sections column contains an array of JSON objects:

[
  {
    "id": 90491,
    "name": "A"
  },
  {
    "id": 90494,
    "name": "B"
  }
]

The subjects column contains a nested JSON object, keyed by subject ID:

{
  "576845": {
    "id": 576845,
    "name": "Computer Education"
  },
  "576848": {
    "id": 576848,
    "name": "English Language"
  },
  "576854": {
    "id": 576854,
    "name": "Environmental Science"
  },
  "576860": {
    "id": 576860,
    "name": "Mathematics"
  }
}

Example Queries

Query 1

To find a Standard record which has a section ID of 90494, the query would be:

SELECT * from standards WHERE JSON_CONTAINS( sections->>'$[*].id', '90494' );

Query 2

To find a Standard record which has the subject ID of 576854, the query would be:

SELECT * from standards WHERE JSON_CONTAINS_PATH( subjects, 'one', '$."576854"');

OR

SELECT * from standards WHERE JSON_CONTAINS( subjects->>'$.*.id', '576854' );

Problem

Now, all the above works. The problem is that the queries perform a full table scan.

Considering Query 1 from above, how can I generate a virtual column with scalar data which contains ALL section IDs?

Each Standard record has multiple sections, with multiple IDs. So, I can't just create an integer virtual column to store a single value. It has to be an array of section IDs, through which we need to search.
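To illustrate the limitation, here is a minimal sketch of what the scalar workaround looks like. The names first_section_id and idx_first_section_id are hypothetical. The column can only hold one element of the array (here the first), so only that element is searchable through the index:

```sql
-- Hypothetical scalar generated column: captures ONLY the first section's id.
ALTER TABLE standards
  ADD COLUMN first_section_id INT UNSIGNED
    GENERATED ALWAYS AS (sections->'$[0].id') VIRTUAL;

-- A scalar generated column can be indexed.
ALTER TABLE standards
  ADD INDEX idx_first_section_id (first_section_id);

-- Can use the index, but finds a row only if 90491 is the FIRST section.
SELECT * FROM standards WHERE first_section_id = 90491;

-- A row whose second section is 90494 is not matched by this predicate.
SELECT * FROM standards WHERE first_section_id = 90494;
```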

So, my generated column would be like below:

ALTER TABLE standards
ADD section_ids json GENERATED ALWAYS AS (sections->>'$[*].id') VIRTUAL NOT NULL;

The generated column will now store just the array of section IDs. But I cannot add an index on the generated column, because it is again a JSON column.

Question - How to utilize index?

So, the question comes down to this - for my queries shown above, how do I avoid full table scans?

Any suggestions would be appreciated.

I won't say this is impossible in MySQL 5.7, because it can be done with clunky workarounds. I won't go into a how-to for that version, though: it is much more difficult, and in many cases you will hit its limitations once a large number of items can be added to the array.

However, it is possible as of MySQL 8.0.17, which supports multi-valued indexes.

ALTER TABLE standards
  ADD INDEX section_ids ( (CAST(sections->'$[*].id' AS UNSIGNED ARRAY)) ),
  ADD INDEX subject_ids ( (CAST(subjects->'$.*.id' AS UNSIGNED ARRAY)) );

Note that $.* matches every property of the object, so $.*.id returns the id value of each property, collected into an array.

EXPLAIN SELECT * from standards WHERE JSON_CONTAINS( sections->'$[*].id', '90494' );
EXPLAIN SELECT * from standards WHERE JSON_CONTAINS( subjects->'$.*.id', '576854' );

You will see that the indexes are used for those queries.
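As a sketch of another way to query the same indexes: MySQL 8.0.17 also added MEMBER OF() and JSON_OVERLAPS(), which the manual lists alongside JSON_CONTAINS() as functions the optimizer can serve from a multi-valued index. This assumes the section_ids and subject_ids indexes created above exist. Whether the index is actually chosen depends on your data, so check the plan with EXPLAIN.

```sql
-- Single value: is 90494 one of the section IDs?
EXPLAIN SELECT * FROM standards
WHERE 90494 MEMBER OF (sections->'$[*].id');

-- Any of several values: does the row have at least one of these subject IDs?
EXPLAIN SELECT * FROM standards
WHERE JSON_OVERLAPS(subjects->'$.*.id', CAST('[576854, 576860]' AS JSON));
```

MEMBER OF() takes the value as a plain SQL number, which avoids quoting it as a JSON document.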

I would solve this in older versions by manually creating a separate index table, and using triggers to keep it up to date.
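A minimal sketch of that approach for the sections column might look like the following. All object names (standard_section_ids, sync_standard_section_ids, the trigger names) are hypothetical, and it assumes every array element has a numeric id. MySQL 5.7 has no JSON_TABLE(), so the array is walked with a loop. DELIMITER is a mysql client command, not SQL.

```sql
-- Hypothetical lookup table: one row per (standard, section id) pair.
CREATE TABLE standard_section_ids (
  standard_id INT NOT NULL,
  section_id  INT UNSIGNED NOT NULL,
  PRIMARY KEY (standard_id, section_id),
  KEY idx_section_id (section_id)
) ENGINE=InnoDB;

DELIMITER //

-- Hypothetical helper: rebuild the lookup rows for one standard.
CREATE PROCEDURE sync_standard_section_ids(IN p_standard_id INT, IN p_sections JSON)
BEGIN
  DECLARE i INT DEFAULT 0;
  DECLARE n INT DEFAULT 0;

  -- JSON_LENGTH(NULL) is NULL, so treat a NULL column as an empty array.
  SET n = COALESCE(JSON_LENGTH(p_sections), 0);

  DELETE FROM standard_section_ids WHERE standard_id = p_standard_id;

  WHILE i < n DO
    -- Assumes each element has a numeric "id"; IGNORE skips duplicate ids.
    INSERT IGNORE INTO standard_section_ids (standard_id, section_id)
    VALUES (p_standard_id,
            CAST(JSON_EXTRACT(p_sections, CONCAT('$[', i, '].id')) AS UNSIGNED));
    SET i = i + 1;
  END WHILE;
END//

CREATE TRIGGER standards_ai AFTER INSERT ON standards
FOR EACH ROW
  CALL sync_standard_section_ids(NEW.id, NEW.sections)//

CREATE TRIGGER standards_au AFTER UPDATE ON standards
FOR EACH ROW
BEGIN
  -- Clear rows under the old id in case the primary key itself changed.
  DELETE FROM standard_section_ids WHERE standard_id = OLD.id;
  CALL sync_standard_section_ids(NEW.id, NEW.sections);
END//

CREATE TRIGGER standards_ad AFTER DELETE ON standards
FOR EACH ROW
  DELETE FROM standard_section_ids WHERE standard_id = OLD.id//

DELIMITER ;

-- Query 1 rewritten as an indexed join on the lookup table.
SELECT s.*
FROM standards AS s
JOIN standard_section_ids AS x ON x.standard_id = s.id
WHERE x.section_id = 90494;
```

Rows that existed before the triggers were created would need a one-off backfill, for example by calling the procedure once per row. The subjects column could be handled with a second table and JSON_KEYS() in the same style.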
