How to write an AWS Redshift query to search for a value in comma-delimited values
table1

user_id | country_code
---|---
1 | 'IN,AU,AC'
2 | 'MX,IN'
table2

user_id | valid_country
---|---
1 | 'IN'
1 | 'AU'
2 | 'MX'
3 | 'YT'
4 | 'RU'
As you can see, some entries in the country_code column contain multiple codes separated by commas. I would like to print each user_id in table1 together with its corresponding country_code, but only for the codes that are valid. To check validity I need to use table2, which has user_id and valid_country.
The desired output is:
user_id | country_code
---|---
1 | 'IN'
1 | 'AU'
2 | 'MX'
The query looks like:
select tb1.user_id, country_code from table1 tb1, table2 tb2 where tb1.user_id=tb2.user_id and <here I need to check whether tb2.valid_country is present in tb1.country_code (codes separated by commas)>
Is there a simple way to check valid_country against the comma-separated values?
The simple way isn't always the best. A number of corner cases can arise here (for example, are all country codes two letters? If not, a substring match could find a short code inside a longer one). That said, a LIKE clause would be simple:
select tb1.user_id, valid_country as country_code
from table1 tb1, table2 tb2
where tb1.user_id=tb2.user_id
and tb1.country_code like '%'||tb2.valid_country||'%'
Or, if we put this in modern SQL syntax:
select tb1.user_id, valid_country as country_code
from table1 tb1 join table2 tb2
on tb1.user_id=tb2.user_id
and tb1.country_code like '%'||tb2.valid_country||'%'
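If the codes are not guaranteed to have the same length, one possible way to tighten the LIKE match is to wrap both the list and the pattern in commas, so that only whole list elements can match. This is a sketch rather than a tested solution: it assumes the list contains no spaces around the commas, and the tb1/tb2 CTEs simply restate the sample data from the question so the query can run on its own.

```sql
WITH
-- sample data from the question; drop these CTEs and use the real tables
tb1(user_id, country_code) AS (
            SELECT 1, 'IN,AU,AC'
  UNION ALL SELECT 2, 'MX,IN'
),
tb2(user_id, valid_country) AS (
            SELECT 1, 'IN'
  UNION ALL SELECT 1, 'AU'
  UNION ALL SELECT 2, 'MX'
  UNION ALL SELECT 3, 'YT'
  UNION ALL SELECT 4, 'RU'
)
SELECT tb1.user_id, tb2.valid_country AS country_code
FROM tb1
JOIN tb2
  ON tb1.user_id = tb2.user_id
  -- ',IN,AU,AC,' LIKE '%,IN,%' matches only a complete element,
  -- so a code such as 'IN' cannot match inside a longer code
 AND (',' || tb1.country_code || ',') LIKE ('%,' || tb2.valid_country || ',%');
```

With the sample data this should return the same three rows as the desired output (row order is not guaranteed without an ORDER BY). If the stored lists may contain spaces after the commas, they would have to be stripped first.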
Try this:
a) Verticalise tb1 by CROSS JOINing it with a series of consecutive integers (which I supply in a Common Table Expression), and applying the SPLIT_PART() function to break the comma-delimited list into single elements.
b) INNER JOIN the verticalised result with the table of valid user_id/country code combinations, using an equi-join on both columns.
WITH
-- your table 1, don't use in end query ...
tb1(user_id,country_code) AS (
SELECT 1,'IN,AU,AC'
UNION ALL SELECT 2,'MX,IN'
)
,
-- your table 2, don't use in end query ...
tb2(user_id,valid_country) AS (
SELECT 1,'IN'
UNION ALL SELECT 1,'AU'
UNION ALL SELECT 2,'MX'
UNION ALL SELECT 3,'YT'
UNION ALL SELECT 4,'RU'
)
-- real query starts here, replace following comma with "WITH" ...
,
i(i) AS ( -- need a series of integers ...
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
)
,
vertical AS (
SELECT
tb1.user_id
, i
, SPLIT_PART(country_code,',',i) AS valid_country
FROM tb1 CROSS JOIN i
WHERE SPLIT_PART(country_code,',',i) <> ''
)
SELECT
vertical.user_id
, vertical.valid_country
FROM vertical
JOIN tb2 USING(user_id,valid_country)
ORDER BY vertical.user_id,vertical.i
;
-- out user_id | valid_country
-- out ---------+---------------
-- out 1 | IN
-- out 1 | AU
-- out 2 | MX
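Note that the integer series i in the query above stops at 5, so any codes beyond the fifth position of a list would be silently ignored. A quick check like the following can show how long the series needs to be. It is only a sketch: the tb1 CTE repeats the sample data, and it assumes every country_code value is a non-empty list (an empty string would still be counted as one element).

```sql
WITH
-- sample data from the question; replace with the real table1
tb1(user_id, country_code) AS (
            SELECT 1, 'IN,AU,AC'
  UNION ALL SELECT 2, 'MX,IN'
)
-- number of elements = number of commas + 1
SELECT MAX(REGEXP_COUNT(country_code, ',') + 1) AS max_codes
FROM tb1;
```

For the sample data this gives 3, so the series 1 to 5 is long enough there. If the result on the real table is larger than 5, the i CTE needs more rows.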