I'm trying to combine two tables using STRUCT and ARRAY. My idea is, for each row in table A, to apply the Levenshtein distance against every row in table B.
Table A:
col1
whisky
delta
Tango
Table B:
col1
Whiskey
delta force
Tango is great
Desired output:
col1    col2            col3
whisky  Whiskey         <lv_distance_score>
        delta force     <lv_distance_score>
        Tango is great  <lv_distance_score>
delta   Whiskey         <lv_distance_score>
        delta force     <lv_distance_score>
        Tango is great  <lv_distance_score>
Tango   Whiskey         <lv_distance_score>
        delta force     <lv_distance_score>
        Tango is great  <lv_distance_score>
For this, I'm first trying to just get the desired output for col1 and col2, but I keep getting an error that says: Scalar subquery produced more than one element.
The query I wrote is:
WITH a AS (
SELECT col1, [STRUCT((SELECT col1 FROM table_B))] AS col2 FROM table_A
)
SELECT col1,c2 FROM a,UNNEST(a.col2) AS c2;
What am I doing wrong here? How can I achieve what I'm looking for?
I'm a bit lost. Why not just use a cross join?
select a.col1, b.col1
from a cross join b
If you want one row per row in a with an array for b, then:
select a.col1, array_agg(b)
from a cross join b
group by a.col1;
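As a rough sketch of that array-per-row shape with the score included, the query below inlines the sample data from the question as CTEs named a and b. It assumes BigQuery GoogleSQL and uses the built-in EDIT_DISTANCE function as a stand-in for the Levenshtein score; the aliases col2 and col3 are borrowed from the desired output and are otherwise arbitrary.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH a AS (
  SELECT 'whisky' AS col1 UNION ALL
  SELECT 'delta' UNION ALL
  SELECT 'Tango'
),
b AS (
  SELECT 'Whiskey' AS col1 UNION ALL
  SELECT 'delta force' UNION ALL
  SELECT 'Tango is great'
)
-- One output row per row of a; each row carries an array of
-- (col2, col3) structs, one struct per row of b.
SELECT
  a.col1,
  ARRAY_AGG(
    STRUCT(b.col1 AS col2, EDIT_DISTANCE(a.col1, b.col1) AS col3)
    ORDER BY EDIT_DISTANCE(a.col1, b.col1)  -- closest match first
  ) AS matches
FROM a
CROSS JOIN b
GROUP BY a.col1;
```

In the BigQuery console a result like this is displayed with col1 shown once and the array elements listed beneath it, which is close to the layout in the question.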
What am I doing wrong here?
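A scalar subquery, such as the one inside your STRUCT(...), must return at most one row, but here it returns every row of table_B. Wrapping it in ARRAY(...) lets it return all of them. Below is a simple fix for your original query: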
WITH a AS (
SELECT col1,
[STRUCT(ARRAY(SELECT col1 FROM table_B) as col2)] AS col2
FROM table_A
)
SELECT col1, c2.col2
FROM a, UNNEST(a.col2) AS c2;
While the above hopefully shows you what was wrong with your query, I am not sure it is the right direction to go.
How can I achieve what I'm looking for?
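You can just go with a simple cross join, as in the example below. Here lv_distance_score stands for your Levenshtein UDF, and the ORDER BY ... LIMIT 1 keeps only the closest match from table_B for each row of table_A: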
SELECT a.col1, ARRAY_AGG(b.col1 ORDER BY lv_distance_score(a.col1, b.col1) LIMIT 1)
FROM table_A a
CROSS JOIN table_B b
GROUP BY a.col1
Note: you can find plenty of examples of a Levenshtein distance UDF here on SO.
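For the flat three-column output in the question, a minimal sketch could look like the following. It assumes BigQuery GoogleSQL, where the built-in EDIT_DISTANCE function computes the Levenshtein distance, so no UDF is needed; if that function is not available in your environment, swap in your own UDF (the lv_distance_score name above is a placeholder). The sample rows are inlined as CTEs so the query runs on its own.

```sql
-- Sample rows from the question, inlined as CTEs for illustration.
WITH table_A AS (
  SELECT 'whisky' AS col1 UNION ALL
  SELECT 'delta' UNION ALL
  SELECT 'Tango'
),
table_B AS (
  SELECT 'Whiskey' AS col1 UNION ALL
  SELECT 'delta force' UNION ALL
  SELECT 'Tango is great'
)
-- CROSS JOIN pairs every row of table_A with every row of table_B,
-- giving 3 x 3 = 9 rows, each with its own distance score.
SELECT
  a.col1 AS col1,
  b.col1 AS col2,
  EDIT_DISTANCE(a.col1, b.col1) AS col3
FROM table_A AS a
CROSS JOIN table_B AS b
ORDER BY col1, col3;
```

If upper and lower case should count as equal, wrap both arguments in LOWER() before comparing.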