MySQL WHERE IN from JSON Array

I have a table with JSON data in it, and a statement that pulls out an array of IDs for each row:

SELECT items.data->"$.matrix[*].id" as ids
FROM items

This results in something like:

+------------+
|    ids     |
+------------+
| [1,2,3]    |
+------------+

Next, I'd like to select rows from another table whose ID appears in that array, similar to WHERE id IN ('1,2,3') but using the JSON array.

Something along the lines of:

SELECT * FROM other_items 
WHERE id IN ( 
  SELECT items.data->"$.matrix[*].id" FROM items
);

But it needs some JSON magic, and I can't work it out.

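Below is a complete answer. You may want a 'use <db_name>;' statement at the top of the script. The point is to show that JSON_CONTAINS() can be used to achieve the desired join.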

DROP TABLE IF EXISTS `tmp_items`;
DROP TABLE IF EXISTS `tmp_other_items`;

CREATE TABLE `tmp_items` (`id` int NOT NULL PRIMARY KEY AUTO_INCREMENT, `data` json NOT NULL);
CREATE TABLE `tmp_other_items` (`id` int NOT NULL, `text` nvarchar(30) NOT NULL);

INSERT INTO `tmp_items` (`data`) 
VALUES 
    ('{ "matrix": [ { "id": 11 }, { "id": 12 }, { "id": 13 } ] }')
,   ('{ "matrix": [ { "id": 21 }, { "id": 22 }, { "id": 23 }, { "id": 24 } ] }')
,   ('{ "matrix": [ { "id": 31 }, { "id": 32 }, { "id": 33 }, { "id": 34 }, { "id": 35 } ] }')
;

INSERT INTO `tmp_other_items` (`id`, `text`) 
VALUES 
    (11, 'text for 11')
,   (12, 'text for 12')
,   (13, 'text for 13')
,   (14, 'text for 14 - never retrieved')
,   (21, 'text for 21')
,   (22, 'text for 22')
-- etc...
;

-- Show join working:
SELECT 
    t1.`id` AS json_table_id
,   t2.`id` AS joined_table_id
,   t2.`text` AS joined_table_text
FROM 
    (SELECT st1.id, st1.data->'$.matrix[*].id' as ids FROM `tmp_items` st1) t1
INNER JOIN `tmp_other_items` t2 ON JSON_CONTAINS(t1.ids, CAST(t2.`id` as json), '$')

You should see the following results:

(Result screenshot not reproduced here.)
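
To see what the ON clause does in isolation, here is a minimal sketch using literal values. The literals are made up for illustration and are not part of the script above. JSON_CONTAINS(target, candidate, path) returns 1 when the candidate is found in the target array at the given path, and 0 otherwise.

```sql
-- Illustrative literals only: 12 is an element of the array, 14 is not.
SELECT
    JSON_CONTAINS('[11, 12, 13]', CAST(12 AS JSON), '$') AS has_12,  -- 1
    JSON_CONTAINS('[11, 12, 13]', CAST(14 AS JSON), '$') AS has_14;  -- 0
```

The CAST(... AS JSON) matters: the candidate must be a JSON value, which is why the join casts t2.`id` before comparing.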

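Note that the accepted answer does not use an index on tmp_other_items, which leads to poor performance on larger tables.

In such cases, I usually use an integers table containing the integers from 0 up to some arbitrary fixed number N (below, about 1 million), and then join on that integers table to get the nth JSON element: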

DROP TABLE IF EXISTS `integers`;
DROP TABLE IF EXISTS `tmp_items`;
DROP TABLE IF EXISTS `tmp_other_items`;

CREATE TABLE `integers` (`n` int NOT NULL PRIMARY KEY);
CREATE TABLE `tmp_items` (`id` int NOT NULL PRIMARY KEY AUTO_INCREMENT, `data` json NOT NULL);
CREATE TABLE `tmp_other_items` (`id` int NOT NULL PRIMARY KEY, `text` nvarchar(30) NOT NULL);

INSERT INTO `tmp_items` (`data`) 
VALUES 
    ('{ "matrix": [ { "id": 11 }, { "id": 12 }, { "id": 13 } ] }'),
   ('{ "matrix": [ { "id": 21 }, { "id": 22 }, { "id": 23 }, { "id": 24 } ] }'),
   ('{ "matrix": [ { "id": 31 }, { "id": 32 }, { "id": 33 }, { "id": 34 }, { "id": 35 } ] }')
;

-- Put a lot of rows in integers (~1M)
INSERT INTO `integers` (`n`) 
(
    SELECT 
        a.X
        + (b.X << 1)
        + (c.X << 2)
        + (d.X << 3)
        + (e.X << 4)
        + (f.X << 5)
        + (g.X << 6)
        + (h.X << 7)
        + (i.X << 8)
        + (j.X << 9)
        + (k.X << 10)
        + (l.X << 11)
        + (m.X << 12)
        + (n.X << 13)
        + (o.X << 14)
        + (p.X << 15)
        + (q.X << 16)
        + (r.X << 17)
        + (s.X << 18)
        + (t.X << 19) AS i
    FROM (SELECT 0 AS x UNION SELECT 1) AS a
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS b ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS c ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS d ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS e ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS f ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS g ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS h ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS i ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS j ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS k ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS l ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS m ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS n ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS o ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS p ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS q ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS r ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS s ON TRUE
        INNER JOIN (SELECT 0 AS x UNION SELECT 1) AS t ON TRUE)
;

-- Insert normal rows (a lot!)
INSERT INTO `tmp_other_items` (`id`, `text`) 
    (SELECT n, CONCAT('text for ', n) FROM integers);
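
The 20-way join above can look opaque, so here is a reduced sketch of the same idea with only three bits. Each derived table contributes one binary digit, and the shifts assemble those digits into a number. The 3-bit size and the column alias n are chosen here purely for illustration.

```sql
-- Three one-bit tables give 2^3 = 8 combinations: a + 2*b + 4*c.
SELECT a.x + (b.x << 1) + (c.x << 2) AS n
FROM (SELECT 0 AS x UNION SELECT 1) AS a
    CROSS JOIN (SELECT 0 AS x UNION SELECT 1) AS b
    CROSS JOIN (SELECT 0 AS x UNION SELECT 1) AS c
ORDER BY n;
-- Returns the eight integers 0 through 7, one per row.
```

With 20 such tables the same pattern yields 2^20 = 1,048,576 rows, which is the "about 1 million" mentioned earlier.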

Now you can try the accepted answer's query again, which takes about 11 seconds to run (but is simple):

-- Show join working (slow)
SELECT 
    t1.`id` AS json_table_id
,   t2.`id` AS joined_table_id
,   t2.`text` AS joined_table_text
FROM 
    (SELECT st1.id, st1.data->'$.matrix[*].id' as ids FROM `tmp_items` st1) t1
INNER JOIN `tmp_other_items` t2 ON JSON_CONTAINS(t1.ids, CAST(t2.`id` as JSON), '$')
;

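And compare it to the faster approach of converting the JSON into a (temporary) table of ids and then JOINing on it (which gives instant results, 0.000 sec according to HeidiSQL):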

-- Fast
SELECT
    i.json_table_id,
    t2.id AS joined_table_id,
    t2.`text` AS joined_table_text
FROM (
    SELECT 
        j.json_table_id,
        -- Don't forget to CAST if needed, so the column type matches the index type
        -- Do an "EXPLAIN" and check its warnings if needed
        CAST(JSON_EXTRACT(j.ids, CONCAT('$[', i.n - 1, ']')) AS UNSIGNED) AS id
    FROM (
        SELECT 
            st1.id AS json_table_id,
            st1.data->'$.matrix[*].id' as ids,
            JSON_LENGTH(st1.data->'$.matrix[*].id') AS len
        FROM `tmp_items` AS st1) AS j
        INNER JOIN integers AS i ON i.n BETWEEN 1 AND len) AS i
    INNER JOIN tmp_other_items AS t2 ON t2.id = i.id
    ;

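The innermost SELECT retrieves the list of JSON ids, along with its length (used by the enclosing join).

The second inner SELECT takes this list of ids and JOINs it against integers to retrieve the nth id of each JSON list, producing a table of ids (rather than a table of JSON).

The outermost SELECT now only has to join this table of ids with the table containing the data you want.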
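
As a small sketch of the middle step, the expression below pulls one element out of a literal array. The array and the position 2 are illustrative values; in the real query the position comes from integers.n and the array from the JSON column.

```sql
-- n is 1-based (from `integers`), JSON array indexes are 0-based, hence n - 1.
SELECT
    JSON_LENGTH('[11, 12, 13]') AS len,  -- 3
    CAST(JSON_EXTRACT('[11, 12, 13]', CONCAT('$[', 2 - 1, ']')) AS UNSIGNED) AS second_id;  -- 12
```

Joining on i.n BETWEEN 1 AND len repeats this extraction once per array position, which is what turns each JSON list into plain rows.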

Below is the same query using WHERE IN, to match the question title:

-- Fast (using WHERE IN)
SELECT t2.*
FROM tmp_other_items AS t2
WHERE t2.id IN (
    SELECT 
        CAST(JSON_EXTRACT(j.ids, CONCAT('$[', i.n - 1, ']')) AS UNSIGNED) AS id
    FROM (
        SELECT 
            st1.data->'$.matrix[*].id' as ids, 
            JSON_LENGTH(st1.data->'$.matrix[*].id') AS len
        FROM `tmp_items` AS st1) AS j
        INNER JOIN integers AS i ON i.n BETWEEN 1 AND len)
    ;

Before JSON was introduced in MySQL, I used this:

  1. Your original data: [1,2,3]

  2. After replacing the commas with '][': [1][2][3]

  3. Wrap your id in '[]'

  4. Then use a reversed LIKE instead of IN: WHERE '[1][2][3]' LIKE '%[1]%'

Answering your question:

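-- MySQL serializes JSON arrays as '[1, 2, 3]', so replace ', ' (comma + space)
SELECT o.* FROM other_items AS o
WHERE EXISTS (
    SELECT 1 FROM items AS i
    WHERE REPLACE(CAST(i.data->'$.matrix[*].id' AS CHAR), ', ', '][')
          LIKE CONCAT('%[', o.id, ']%')
);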

Why wrap in '[]'? Without the wrapping:

'[12,23,34]' LIKE '%1%' --> true
'[12,23,34]' LIKE '%12%' --> true

With the '[]' wrapping:

'[12][23][34]' LIKE '%[1]%' --> false
'[12][23][34]' LIKE '%[12]%' --> true
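
The comparisons above are shorthand. A runnable sketch of the same check follows, assuming the array text uses a comma plus a space between elements, which is how MySQL prints JSON arrays. The literal array is illustrative only.

```sql
SELECT
    REPLACE('[12, 23, 34]', ', ', '][') AS wrapped,                             -- [12][23][34]
    REPLACE('[12, 23, 34]', ', ', '][') LIKE CONCAT('%[', 1, ']%')  AS has_1,   -- 0
    REPLACE('[12, 23, 34]', ', ', '][') LIKE CONCAT('%[', 12, ']%') AS has_12;  -- 1
```

Note that a LIKE with a leading '%' cannot use an index, so this trick is best kept for small tables.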

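Starting from MySQL 8.0.17, there is the MEMBER OF operator, which does exactly what you're looking for.

The query should be rewritten in the form of a JOIN, though: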

SELECT o.* FROM other_items o
JOIN items i ON o.id MEMBER OF(i.data->>'$.id')

If you want your query to have better performance, consider using multi-valued indexes on your JSON column.
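
A minimal sketch of such an index is shown here. It assumes MySQL 8.0.17 or later and integer ids stored under $.id; the table name items_mv_demo and the index name idx_ids are hypothetical, chosen only for this example.

```sql
-- Hypothetical demo table; multi-valued indexes require MySQL 8.0.17+.
CREATE TABLE items_mv_demo (
    data JSON,
    -- Index every element of the $.id array as an unsigned integer.
    INDEX idx_ids ( (CAST(data->'$.id' AS UNSIGNED ARRAY)) )
);

INSERT INTO items_mv_demo (data) VALUES ('{"id": [1, 2, 3]}');

-- The predicate uses the same expression as the index definition,
-- so the optimizer can consider idx_ids; inspect the plan to confirm.
EXPLAIN SELECT * FROM items_mv_demo WHERE 3 MEMBER OF(data->'$.id');
```

Whether the index is actually chosen depends on the data and the query shape, so it is worth checking the EXPLAIN output on your own tables.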


The use of MEMBER OF() can be explained more clearly with the following example:

CREATE TABLE items ( data JSON );

INSERT INTO items
SET data = '{"id":[1,2,3]}';

This is how you can find out whether a value is present in the JSON array:

SELECT * FROM items
WHERE 3 MEMBER OF(data->>'$.id');
+-------------------+
| data              |
+-------------------+
| {"id": [1, 2, 3]} |
+-------------------+
1 row in set (0.00 sec)

Note that, unlike in a regular comparison, the type of the value matters here. If you pass it as a string, there will be no match:

SELECT * FROM items
WHERE "3" MEMBER OF(data->>'$.id');

Empty set (0.00 sec)

A regular comparison, by contrast, returns 1:

SELECT 3 = "3";
+---------+
| 3 = "3" |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)
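
If the value arrives as a string (for example from application input), one possible workaround is to cast it before the test. This is a sketch with literal values, not part of the original answer.

```sql
SELECT
    3 MEMBER OF('[1, 2, 3]')                     AS int_value,     -- 1
    '3' MEMBER OF('[1, 2, 3]')                   AS string_value,  -- 0
    CAST('3' AS UNSIGNED) MEMBER OF('[1, 2, 3]') AS cast_value;    -- 1
```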
