How to write a Redshift (AWS) query to search for a value in comma-delimited values

table1

user_id  country_code
1        'IN,AU,AC'
2        'MX,IN'

table2

user_id  valid_country
1        'IN'
1        'AU'
2        'MX'
3        'YT'
4        'RU'

As you can see, some entries in the country_code column contain multiple codes separated by commas. I would like to print each user_id in table1 together with its country codes, but only the codes that are valid. To check validity I need to use table2, which has user_id and valid_country.

The desired output is:

user_id  country_code
1        'IN'
1        'AU'
2        'MX'

The query looks like this:

select tb1.user_id, country_code from table1 tb1, table2 tb2 where tb1.user_id=tb2.user_id and <here I need to check whether tb2.valid_country is present in tb1.country_code (codes separated by commas)>

Is there a simple way to check valid_country against the comma-separated values?

The simple way isn't always the best. A number of corner cases can arise here (for example, are all country codes exactly two letters? A LIKE pattern matches substrings, so a code that appears inside a longer code would also match). That said, a LIKE clause would be simple:

select tb1.user_id, valid_country as country_code
from table1 tb1, table2 tb2 
where tb1.user_id=tb2.user_id 
  and tb1.country_code like '%'||tb2.valid_country||'%'

Or, if we put this in modern SQL syntax with an explicit JOIN:

select tb1.user_id, valid_country as country_code
from table1 tb1 join table2 tb2 
on tb1.user_id=tb2.user_id 
  and tb1.country_code like '%'||tb2.valid_country||'%'
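One of the corner cases mentioned above is that a plain substring match can give false positives when codes differ in length. A possible hedge, sketched here under the assumption that the codes are separated by bare commas with no spaces, is to wrap both the list and the searched value in delimiters so that only whole codes match:

```sql
-- Sketch: delimiter-aware variant of the LIKE match above.
-- Assumes codes are separated by bare commas (no surrounding spaces)
-- and that valid_country never contains the LIKE wildcards % or _.
select tb1.user_id, tb2.valid_country as country_code
from table1 tb1
join table2 tb2
  on tb1.user_id = tb2.user_id
 and ',' || tb1.country_code || ',' like '%,' || tb2.valid_country || ',%'
```

On the sample data this should return the same three rows as the simpler query; the difference only shows up when one code is a substring of another.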

Try this:

a) Verticalise tb1 by CROSS JOINing it with a series of consecutive integers (which I supply in a Common Table Expression), and apply the SPLIT_PART() function to break the comma-delimited list into single elements.

b) INNER JOIN the verticalised result with the table of valid user_id/country code combinations, using an equi-join on both columns.

WITH   
-- your table 1, don't use in end query ...
tb1(user_id,country_code) AS (
          SELECT 1,'IN,AU,AC'
UNION ALL SELECT 2,'MX,IN'
)
,
-- your table 2, don't use in end query ...
tb2(user_id,valid_country) AS (
          SELECT 1,'IN'
UNION ALL SELECT 1,'AU'
UNION ALL SELECT 2,'MX'
UNION ALL SELECT 3,'YT'
UNION ALL SELECT 4,'RU'
)
-- real query starts here, replace following comma with "WITH" ...
,
i(i) AS ( -- need a series of integers; 1..5 covers lists of up to 5 codes ...
          SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
)
,
vertical AS (
  SELECT
    tb1.user_id
  , i
  , SPLIT_PART(country_code,',',i) AS valid_country
  FROM tb1 CROSS JOIN i
  WHERE SPLIT_PART(country_code,',',i) <> ''
)
SELECT
  vertical.user_id
, vertical.valid_country
FROM vertical
JOIN tb2 USING(user_id,valid_country)
ORDER BY vertical.user_id,vertical.i
;
-- out  user_id | valid_country 
-- out ---------+---------------
-- out        1 | IN
-- out        1 | AU
-- out        2 | MX
