MySQL how to search a JSON field for all rows that intersect with JSON values

Assume that I have a MySQL table with a JSON field, with the structure and values shown below:

CREATE TABLE `projects` (
  `project_ids` json DEFAULT NULL
) ENGINE=InnoDB;

INSERT INTO `projects` (`project_ids`)
VALUES
    ('[1, 2, 3]'),
    ('[1, 2]'),
    ('[2]'),
    ('[1]'),
    ('[2, 3]');

I would like a SELECT query that returns every row whose JSON array contains ANY of a given set of values.

For example:

SELECT * FROM projects WHERE JSON_CONTAINS(project_ids, '[1,2]');

This yields 2 rows:

[1, 2, 3]
[1, 2]

However, I would like it to yield all of the rows, because every row has at least a "1" or a "2" in its JSON array. In other words, I'm looking for a query that returns the rows where the intersection of the stored array and the search array is non-empty.

Thus, I WANT it to return:

[1, 2, 3]
[1, 2]
[2]
[1]
[2, 3]

Please ignore the larger question of whether it's even a good idea to use JSON fields instead of foreign keys, etc. Assume that I have to use a JSON field.

OR is the simplest method:

SELECT *
FROM projects
WHERE JSON_CONTAINS(project_ids, '[1]') OR
      JSON_CONTAINS(project_ids, '[2]') ;

But if you want to pass in the JSON array, use JSON_OVERLAPS() (available in MySQL 8.0.17 and later):

SELECT *
FROM projects
WHERE JSON_OVERLAPS(project_ids, '[1,2]') ;
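
To make the difference between the two functions concrete, here is a minimal Python sketch that models their behavior for flat arrays of scalars, such as the ones in the question. It is only a model under that assumption, not MySQL's implementation: the helper names json_contains and json_overlaps are hypothetical, and nested arrays or objects are not handled. JSON_CONTAINS behaves like a subset test, while JSON_OVERLAPS behaves like a non-empty intersection test.

```python
import json


def json_contains(target: str, candidate: str) -> bool:
    """Model of JSON_CONTAINS for flat arrays: every candidate element is in target."""
    return set(json.loads(candidate)) <= set(json.loads(target))


def json_overlaps(doc1: str, doc2: str) -> bool:
    """Model of JSON_OVERLAPS for flat arrays: the two arrays share at least one element."""
    return not set(json.loads(doc1)).isdisjoint(json.loads(doc2))


# The five rows from the question and the search array.
rows = ['[1, 2, 3]', '[1, 2]', '[2]', '[1]', '[2, 3]']
search = '[1,2]'

contains_hits = [r for r in rows if json_contains(r, search)]
overlaps_hits = [r for r in rows if json_overlaps(r, search)]

print(contains_hits)  # ['[1, 2, 3]', '[1, 2]']
print(overlaps_hits)  # ['[1, 2, 3]', '[1, 2]', '[2]', '[1]', '[2, 3]']
```

The subset test explains why the original JSON_CONTAINS query returned only two rows, and the intersection test matches the result the question asks for.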

Here is a db<>fiddle.

The answer from @GordonLinoff gives the correct result, but it's worth mentioning that most solutions that use JSON functions cannot be optimized with indexes. Consider what happens as the table gets larger: every query does a table scan, so performance gets steadily worse as the table grows.

EXPLAIN shows this:

+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | projects | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+

(On a real table, the value in the rows column will be as large as the number of rows in the table.)

Instead, an easier way to implement this kind of query is to store the data in normal rows and columns, not JSON. Store one project_id per row.

Then the query becomes much clearer, and it can be optimized with an index:

SELECT * FROM projects WHERE project_id IN (1, 2);
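
As a rough illustration of the normalized layout, the following Python sketch stores one project_id per row and runs the IN query against an indexed column. Several things here are assumptions and not part of the original answer: it uses SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, and the table name project_links, the owner_id column that identifies which original JSON row a value came from, and the index name idx_project_id are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_links (          -- hypothetical table name
        owner_id   INTEGER NOT NULL,      -- hypothetical: identifies the original JSON row
        project_id INTEGER NOT NULL,
        PRIMARY KEY (owner_id, project_id)
    );
    CREATE INDEX idx_project_id ON project_links (project_id);
""")

# The five JSON arrays from the question, keyed by a made-up owner id.
json_rows = {1: [1, 2, 3], 2: [1, 2], 3: [2], 4: [1], 5: [2, 3]}
conn.executemany(
    "INSERT INTO project_links (owner_id, project_id) VALUES (?, ?)",
    [(owner, pid) for owner, pids in json_rows.items() for pid in pids],
)

# Owners that have project 1 or project 2: the non-empty intersection.
matching_owners = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT owner_id FROM project_links "
        "WHERE project_id IN (1, 2) ORDER BY owner_id"
    )
]
print(matching_owners)  # [1, 2, 3, 4, 5]

# Print SQLite's query plan so the access path can be inspected.
for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT owner_id FROM project_links WHERE project_id IN (1, 2)"
):
    print(row[-1])
```

In MySQL the same idea applies: with an ordinary index on project_id, EXPLAIN on the IN query can be checked to confirm that the optimizer is no longer forced into a full table scan.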

I understand you mentioned in your question that we should take it as a given that using JSON is a requirement. You have a solution to get the result you described. I just want to make clear to other readers:

If your query references a JSON column in any clause other than the select-list, it's a sign that you should store the data in a normalized form, not in JSON.
