SQL query to match a comma-separated string against a comma-separated string?

The MySQL query below is built in PHP from two variables: $sector, which is a single digit, and $subsector_text, which is a comma-separated string. $subsector_text can hold a single ID or a list of several IDs, such as "3,4,7,9".

  $sql = "
SELECT DISTINCT a.id
              , a.name
              , a.category_id
              , a.sector
              , a.subsector
              , a.year_in_operation
              , a.state
              , a.total_value
              , b.country_id
              , b.project_id
              , c.isocode_3
              , c.name
           FROM com_barchan_project a
           JOIN com_barchan_location b
             ON b.project_id = a.id
           JOIN com_barchan_country c
             ON c.id = b.country_id
           JOIN com_barchan_project_value_join d
             ON a.id = d.project_id
          WHERE a.state = 1 
            AND a.sector = '$sector'
            AND a.subsector REGEXP '^{$subsector_text}[,]|[,]{$subsector_text}[,]|[,]{$subsector_text}$|^{$subsector_text}$'
          ORDER 
             BY a.total_value DESC
              , a.category_id ASC
              , a.name ASC
";

The problem I'm having with the query above is with this line:

AND a.subsector REGEXP '^{$subsector_text}[,]|[,]{$subsector_text}[,]|[,]{$subsector_text}$|^{$subsector_text}$'  

If $subsector_text = "3,4,5,9", then the query only returns records whose subsector column contains exactly "3,4,5,9".
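
To see why, it may help to look at what the pattern becomes once PHP has interpolated the variable. The sketch below is only an illustration: `matchesWholeList()` is a hypothetical helper, and it uses PHP's `preg_match()` as a stand-in for MySQL's REGEXP, on the assumption that the two engines agree for a pattern this simple.

```php
<?php
// Hypothetical helper (not from the question): rebuilds the REGEXP pattern
// exactly as the query string does, then tests it against one column value.
function matchesWholeList(string $columnValue, string $subsectorText): bool
{
    // The whole list "3,4,5,9" lands inside every alternative, so the
    // pattern looks for that complete sequence, never for a single ID.
    $pattern = "^{$subsectorText}[,]|[,]{$subsectorText}[,]|[,]{$subsectorText}$|^{$subsectorText}$";

    // preg_match() is used here only to approximate MySQL's REGEXP.
    return preg_match('/' . $pattern . '/', $columnValue) === 1;
}

var_dump(matchesWholeList('3,4,5,9', '3,4,5,9')); // bool(true)
var_dump(matchesWholeList('1,3', '3,4,5,9'));     // bool(false)
var_dump(matchesWholeList('3', '3,4,5,9'));       // bool(false)
```

In other words, the pattern treats the full list as one token, which is why rows such as "1,3" or "3" never match.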

The desired result is that it returns any record whose subsector column contains any of the values in $subsector_text. For instance, all of the following should be returned, but currently are not. This list is only an example, not an exact list.

1,3
1,5
1,3,7,9
3,5
3,4,5,9
9
3
5
4

How do I change the query so that it selects any record whose subsector column contains at least one of the values in the $subsector_text string?

Please note: if $subsector_text = 11, then values such as the following should not be selected:

1
12
21

Any help would be greatly appreciated.

It's not practical to match any value in a comma-separated string against any value in another comma-separated string in a single predicate.

You can use FIND_IN_SET() to search for one value at a time.

This means you need multiple predicates, one for each value you get by splitting your input $subsector_text. So split your variable and map it into a series of FIND_IN_SET() calls.

I haven't tested the following code, but it should give you the idea of what I'm talking about:

$subsector_array = array_map('intval', explode(',', $subsector_text));
$subsector_terms = array_map(
  function ($id) { return "FIND_IN_SET($id, a.subsector)"; },
  $subsector_array);
$subsector_expr = implode(' OR ', $subsector_terms);

$sql = "
SELECT ...
          WHERE a.state = 1 
            AND a.sector = '$sector'
            AND ($subsector_expr)
...";
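
A variation on the same idea, in case you would rather bind the IDs as parameters than interpolate them: the sketch below builds the FIND_IN_SET() expression with `?` placeholders and returns the values separately. `buildSubsectorFilter()` is a hypothetical helper, and the commented usage assumes a PDO connection in `$pdo`; neither comes from the code above, and this is untested against a real database.

```php
<?php
/**
 * Hypothetical helper: returns [SQL fragment, values to bind] for a
 * comma-separated list of subsector IDs.
 */
function buildSubsectorFilter(string $subsectorText): array
{
    // Split on commas, trim whitespace, and keep only all-digit pieces.
    $ids = [];
    foreach (explode(',', $subsectorText) as $piece) {
        $piece = trim($piece);
        if ($piece !== '' && ctype_digit($piece)) {
            $ids[] = (int) $piece;
        }
    }
    // Drop duplicates and reindex from 0.
    $ids = array_values(array_unique($ids));

    if ($ids === []) {
        // No usable IDs: return a predicate that matches nothing.
        return ['(0 = 1)', []];
    }

    // One FIND_IN_SET() term per ID, joined with OR.
    $terms = array_fill(0, count($ids), 'FIND_IN_SET(?, a.subsector)');
    return ['(' . implode(' OR ', $terms) . ')', $ids];
}

[$expr, $params] = buildSubsectorFilter('3,4,7,9');

// Assumed usage with PDO (the SELECT list is elided as in the answer):
// $stmt = $pdo->prepare("SELECT ... WHERE a.state = 1 AND a.sector = ? AND $expr ...");
// $stmt->execute(array_merge([$sector], $params));
```

Binding $sector the same way would also remove the quoted '$sector' interpolation from the query string.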

This will of course force a table scan, because there's no way to index FIND_IN_SET() or any other operation that searches for substrings. Well, I suppose your conditions on a.state and a.sector can use an index to narrow down the search before the FIND_IN_SET() conditions are applied.

I understand the dilemma of having to work with a system that you inherited. Let your manager know that this needs to get refactored at some point, because it will never be efficient or reliable the way it's designed now.

Your approach is correct, but it needs some modifications. Instead of trying to match everything in a single REGEXP condition, you can create one condition per value and join the conditions with OR.

Example:

// Cast each ID to an integer so it is safe to interpolate into the pattern.
$subsectorArray = array_map('intval', explode(',', $subsector_text));
$or = [];
foreach ($subsectorArray as $subsector){
    $or[] = "a.subsector REGEXP '[^[:alnum:]]{$subsector}[^[:alnum:]]|^{$subsector}[^[:alnum:]]|[^[:alnum:]]{$subsector}$|^{$subsector}$'";
}
$orStr = implode(' OR ', $or);

 $sql = "
SELECT DISTINCT a.id
              , a.name
              , a.category_id
              , a.sector
              , a.subsector
              , a.year_in_operation
              , a.state
              , a.total_value
              , b.country_id
              , b.project_id
              , c.isocode_3
              , c.name
           FROM com_barchan_project a
           JOIN com_barchan_location b
             ON b.project_id = a.id
           JOIN com_barchan_country c
             ON c.id = b.country_id
           JOIN com_barchan_project_value_join d
             ON a.id = d.project_id
          WHERE a.state = 1 
            AND a.sector = '$sector'
            AND ($orStr)
          ORDER 
             BY a.total_value DESC
              , a.category_id ASC
              , a.name ASC
";

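As a quick sanity check of the per-value pattern, the sketch below runs it against a few sample column values, including the "11 must not match 1, 12 or 21" case from the question. `subsectorPattern()` and `containsId()` are hypothetical names, and `preg_match()` stands in for MySQL's REGEXP, assuming both treat `[[:alnum:]]` the same way for these inputs.

```php
<?php
// Hypothetical helper: the same four alternatives as in the answer,
// built for a single ID.
function subsectorPattern(int $id): string
{
    return "[^[:alnum:]]{$id}[^[:alnum:]]|^{$id}[^[:alnum:]]|[^[:alnum:]]{$id}$|^{$id}$";
}

// Hypothetical helper: true if the comma-separated value contains the ID.
// preg_match() only approximates MySQL's REGEXP here.
function containsId(string $columnValue, int $id): bool
{
    return preg_match('/' . subsectorPattern($id) . '/', $columnValue) === 1;
}

var_dump(containsId('1,3,7,9', 3)); // bool(true)
var_dump(containsId('3,11', 11));   // bool(true)
var_dump(containsId('12', 11));     // bool(false)
var_dump(containsId('21', 11));     // bool(false)
```

Because each ID has to be bounded by a non-alphanumeric character or by the start or end of the string, 11 is not found inside 1, 12 or 21.
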
The solution was to refactor the app. It took a couple of days, but the offending code is gone and a new subsector table was created. Thanks, everyone.
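
For anyone facing the same refactor, here is a rough sketch of the data migration step, under my own assumptions: the thread does not describe the new table, so the idea of one row per (project, subsector) pair and the helper name `explodeSubsectors()` are illustrative only.

```php
<?php
/**
 * Hypothetical migration helper: turns legacy rows (project id plus a
 * comma-separated subsector string) into one [project_id, subsector_id]
 * pair per value, ready to insert into a link table.
 */
function explodeSubsectors(array $projects): array
{
    $pairs = [];
    foreach ($projects as $project) {
        foreach (explode(',', $project['subsector']) as $piece) {
            $piece = trim($piece);
            // Skip empty or non-numeric fragments left over in old data.
            if ($piece !== '' && ctype_digit($piece)) {
                $pairs[] = [(int) $project['id'], (int) $piece];
            }
        }
    }
    return $pairs;
}

$pairs = explodeSubsectors([
    ['id' => 1, 'subsector' => '1,3'],
    ['id' => 2, 'subsector' => '9'],
]);
// $pairs is [[1, 1], [1, 3], [2, 9]]
```

With a link table along those lines, the subsector filter can become a plain join with an IN (...) list on the subsector ID column, which MySQL can serve from an index instead of scanning strings.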
