My earlier question was resolved. Now I need to develop a related, but more complex query.
I have a table like this:
id    description        additional_info
-------------------------------------------
123   games              XYD
124   Festivals sport    swim
And I need to count matches to arrays like this:
array_content varchar[] := '{"Festivals,games","sport,swim"}';
If either of the columns description and additional_info contains any of the comma-separated tags in an array element, we count that as 1. So each array element (consisting of multiple words) can contribute at most 1 to the total count.
The result for the above example should be:
id   RID   Matches
1    123   1
2    124   2
The answer isn't simple, but figuring out what you are asking was harder:
SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest(array_content) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id;
Produces your desired result exactly.
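To try this outside of a function, here is a minimal self-contained sketch. The table definition, the column types, and the way row 124 is split between the two columns are assumptions based on the sample in the question, and the array literal stands in for the array_content variable. WITH ORDINALITY requires PostgreSQL 9.4 or later.

```sql
-- Assumed sample data; the column split for row 124 is a guess.
CREATE TEMP TABLE tbl (
   id              int PRIMARY KEY
 , description     text
 , additional_info text
);

INSERT INTO tbl VALUES
   (123, 'games'          , 'XYD')
 , (124, 'Festivals sport', 'swim');

-- Same query as above, with the array passed as a literal.
SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id;
```

With this sample data the query should return the two rows shown in the question.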
array_content is your array of search terms.
Each array element of the outer array in your search term is a comma-separated list. Decompose the odd construct by unnesting twice (after transforming each element of the outer array into another array). Example:
SELECT *
FROM   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
CROSS  JOIN LATERAL
       unnest(string_to_array(elem, ',')) txt;
Result:
      elem       | ord |    txt
-----------------+-----+-----------
 Festivals,games |   1 | Festivals
 Festivals,games |   1 | games
 sport,swim      |   2 | sport
 sport,swim      |   2 | swim
Since you want to count matches for each outer array element only once, we generate a unique number per element on the fly with WITH ORDINALITY.
Now we can LEFT JOIN to this derived table on the condition of a desired match:
   ... ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
... and get the count with count(DISTINCT a.ord), counting each array element only once even if multiple search terms match.
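To see why DISTINCT matters, the following sketch puts a plain count next to the distinct count. The inline VALUES data and the output column names matching_terms and matching_elements are made up for illustration, and the column split for row 124 is an assumption.

```sql
WITH tbl(id, description, additional_info) AS (
   VALUES (123, 'games'          , 'XYD')
        , (124, 'Festivals sport', 'swim')   -- assumed column split
   )
SELECT t.id
     , count(a.ord)          AS matching_terms     -- one per matching search term
     , count(DISTINCT a.ord) AS matching_elements  -- one per matching array element
FROM   tbl t
LEFT   JOIN (
   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id
ORDER  BY t.id;
```

For row 124, three terms match (Festivals, sport, swim) but they belong to only two array elements, so matching_terms is 3 while matching_elements is 2. Because of the LEFT JOIN, a row without any match is still returned, with a count of 0.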
Finally, I added the mysterious id in your result with row_number() OVER (ORDER BY t.id) AS id, assuming it's supposed to be a serial number. Voilà.
The same considerations for regular expression matches (~) as in your previous question apply.
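The earlier answer is not reproduced here, so which considerations it covered is an assumption. A few properties of the ~ operator that are likely relevant can be checked with a short sketch like this (assuming the default setting standard_conforming_strings = on, so backslashes in the pattern are passed through literally):

```sql
SELECT 'Festivals sport' ~  'festivals' AS case_sensitive    -- false: ~ is case-sensitive
     , 'Festivals sport' ~* 'festivals' AS case_insensitive  -- true:  ~* ignores case
     , 'swimming'        ~  'swim'      AS substring_match   -- true:  matches anywhere in the string
     , 'swimming'        ~  '\yswim\y'  AS whole_word_only;  -- false: \y anchors at word boundaries
```

Each search term is also interpreted as a regular expression, so terms containing characters with special meaning (such as . or +) would need escaping.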