
MySQL Split 2 comma separated columns into multiple rows with lookup

I have a table where two columns hold comma-separated values, and another table of categories. group_1 can have up to 30 separated values and group_2 can have up to 5.

myProducts

+-----------+-------------+-----------------------+
| alias     | group_1     | group_2               |
+-----------+-------------+-----------------------+
| product_a | 1,2,3,[...] | uk, us [...]          |
| product_b | 2,4,[...]   | uk, us, [...]         |
| product_c | 1,4,[...]   | spain, germany, [...] |
+-----------+-------------+-----------------------+

myCategories

+----+------------+
| id | category   |
+----+------------+
|  1 | category_a |
|  2 | category_b |
|  3 | category_c |
|  4 | category_d |
+----+------------+

Is it possible with MySQL to split both comma-separated fields into multiple rows and return the results after looking up each value in the categories table? In the example above, the first row of the original table would return:

+-----------+------------+---------+
| alias     | group_1    | group_2 |
+-----------+------------+---------+
| product_a | category_a | uk      |
| product_a | category_a | us      |
| product_a | category_b | uk      |
| product_a | category_b | us      |
| product_a | category_c | uk      |
| product_a | category_c | us      |
| ...       | ...        | ...     |
+-----------+------------+---------+

The lookup part is desired, but if it proves too complicated, I can live without it.

Yes, but you won't be happy with the performance.

You can match a comma-separated list against an individual value using MySQL's FIND_IN_SET() function.
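As a quick illustration of how the function behaves (the literal values here are chosen only for demonstration): FIND_IN_SET() returns the 1-based position of the value in the list, or 0 when it is absent, so its result can be used directly as a join condition.

```sql
-- Position of the value within the comma-separated list (1-based).
SELECT FIND_IN_SET('2', '1,2,3');   -- returns 2

-- 0 means "not found", which MySQL treats as false in an ON clause.
SELECT FIND_IN_SET('5', '1,2,3');   -- returns 0

-- A NULL list yields NULL, so rows without a list never match.
SELECT FIND_IN_SET('2', NULL);      -- returns NULL
```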

select p.alias, a.category as group_1, c.country as group_2 
from myProducts p join myCategories a on find_in_set(a.id, p.group_1) 
join countries c on find_in_set(c.country, p.group_2);

+-----------+------------+---------+
| alias     | group_1    | group_2 |
+-----------+------------+---------+
| product_c | category_a | spain   |
| product_c | category_d | spain   |
| product_a | category_a | uk      |
| product_a | category_b | uk      |
| product_b | category_b | uk      |
| product_a | category_c | uk      |
| product_b | category_d | uk      |
+-----------+------------+---------+

I did create another lookup table, countries:

create table countries (country varchar(20) primary key);
insert into countries values ('uk'),('us'),('spain'),('germany');
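For anyone who wants to reproduce the results, the question's two tables could be set up roughly as follows. The column types are assumptions, since the question gives no schema. The values are copied literally from the question, `[...]` placeholder included; that is an inference, but it is consistent with the result sets shown in this answer (it would explain, for example, why product_a never gets a `us` row).

```sql
-- Assumed schema: the question does not state column types.
CREATE TABLE myCategories (
  id       INT PRIMARY KEY,
  category VARCHAR(20) NOT NULL
);
INSERT INTO myCategories VALUES
  (1, 'category_a'), (2, 'category_b'), (3, 'category_c'), (4, 'category_d');

CREATE TABLE myProducts (
  alias   VARCHAR(20) PRIMARY KEY,
  group_1 VARCHAR(255),  -- comma-separated myCategories.id values
  group_2 VARCHAR(255)   -- comma-separated country names
);
-- Values copied as written in the question, including the [...] placeholder.
INSERT INTO myProducts VALUES
  ('product_a', '1,2,3,[...]', 'uk, us [...]'),
  ('product_b', '2,4,[...]',   'uk, us, [...]'),
  ('product_c', '1,4,[...]',   'spain, germany, [...]');
```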

Caveat: if the comma-separated list contains spaces, they are treated as part of each string in the list, so you need to remove the spaces before matching.

select p.alias, a.category as group_1, c.country as group_2 
from myProducts p join myCategories a on find_in_set(a.id, p.group_1) 
join countries c on find_in_set(c.country, replace(p.group_2,' ',''));

+-----------+------------+---------+
| alias     | group_1    | group_2 |
+-----------+------------+---------+
| product_c | category_a | germany |
| product_c | category_d | germany |
| product_c | category_a | spain   |
| product_c | category_d | spain   |
| product_a | category_a | uk      |
| product_a | category_b | uk      |
| product_b | category_b | uk      |
| product_a | category_c | uk      |
| product_b | category_d | uk      |
| product_b | category_b | us      |
| product_b | category_d | us      |
+-----------+------------+---------+

But there's no way to optimize the lookups with indexes if you do this, so every join will be a table scan. As your tables get larger, you'll find the performance degrades to the point of being unusable.

The way to optimize this is to avoid using comma-separated lists. Normalize the many-to-many relationships into new tables. Then the lookups can use indexes, and you'll avoid the degraded performance, along with all the other problems that come with comma-separated lists.
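A minimal sketch of what that normalization could look like, assuming the myCategories table from the question already exists. The junction table names myProductCategories and myProductCountries are hypothetical, as are the column types; each table stores one row per pair instead of a list.

```sql
-- Hypothetical junction table: one row per (product, category) pair.
CREATE TABLE myProductCategories (
  alias       VARCHAR(20) NOT NULL,
  category_id INT         NOT NULL,  -- refers to myCategories.id
  PRIMARY KEY (alias, category_id),
  KEY (category_id)
);

-- Hypothetical junction table: one row per (product, country) pair.
CREATE TABLE myProductCountries (
  alias   VARCHAR(20) NOT NULL,
  country VARCHAR(20) NOT NULL,
  PRIMARY KEY (alias, country)
);

-- Only the values visible in the question's first row are loaded here.
INSERT INTO myProductCategories VALUES
  ('product_a', 1), ('product_a', 2), ('product_a', 3);
INSERT INTO myProductCountries VALUES
  ('product_a', 'uk'), ('product_a', 'us');

-- Every category of a product paired with every country of that product.
SELECT pc.alias, a.category AS group_1, pk.country AS group_2
FROM myProductCategories pc
JOIN myCategories a        ON a.id = pc.category_id
JOIN myProductCountries pk ON pk.alias = pc.alias
ORDER BY pc.alias, a.category, pk.country;
```

With the sample rows above, the query yields the six product_a rows the question asks for. The joins compare columns for equality, so MySQL can use the keys rather than evaluating FIND_IN_SET() against every row.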


Re your comment:

You can create a derived table by listing countries explicitly:

FROM ...
JOIN (
  SELECT 'us' AS country UNION SELECT 'uk' UNION SELECT 'spain' UNION SELECT 'germany'
) AS c
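Combined with the earlier query, the derived table would presumably stand in for the countries table like this. This is a sketch: the elided FROM clause is assumed to be the same joins as before.

```sql
SELECT p.alias, a.category AS group_1, c.country AS group_2
FROM myProducts p
JOIN myCategories a ON FIND_IN_SET(a.id, p.group_1)
JOIN (
  -- Inline list of countries in place of a real lookup table.
  SELECT 'us' AS country UNION SELECT 'uk' UNION SELECT 'spain' UNION SELECT 'germany'
) AS c ON FIND_IN_SET(c.country, REPLACE(p.group_2, ' ', ''));
```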

But this is getting pretty ridiculous. You aren't using SQL to any advantage. You might as well fetch the whole dataset back into your client application and sort it into data structures in memory.
