I hope you can help. I have the query below, which contains a CASE expression.
I want to say:
if the domain is in the other table, return the domain name; otherwise, mark it as 'other'.
I am using Hive and get the error:
Unsupported SubQuery Expression 'cleandomain': Currently SubQuery expressions are only allowed as Where Clause predicates
Is there another way to achieve the same result?
SELECT *,
CASE
WHEN cleandomain IN (SELECT cleandomain
FROM keenek1.daily_top_doms) THEN cleandomain
ELSE 'other'
END AS status
FROM (SELECT hour,.....
One possible solution is to use the in_file(string str, string filename) function, which returns true if str appears as an entire line in the file.
Put the list of domains in a text file, one domain per line, and call in_file
in the CASE expression:
CASE
WHEN in_file(cleandomain,'file/path/daily_top_doms.txt') THEN cleandomain
ELSE 'other'
END AS status
Another solution is to aggregate the list of domains into an array in a subquery, cross join that single row to your query, and test membership with array_contains(). This may work much faster if the list is not too big:
with dom as (
SELECT collect_set(cleandomain) dom
FROM keenek1.daily_top_doms
)
select
case when array_contains(d.dom, s.cleandomain) then s.cleandomain
else 'other'
end as status
from (your query) s cross join dom d -- dom has exactly one row, so the cross join does not multiply rows
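A further option, not part of the answer above, is a plain LEFT JOIN against the distinct list of domains, which avoids the subquery in the CASE expression altogether. The following is only a minimal sketch: the CTE s and its two sample rows are hypothetical stand-ins for your query, and it assumes keenek1.daily_top_doms has a cleandomain column as in the question.

```sql
-- Hypothetical sample rows standing in for "your query"
WITH s AS (
  SELECT 'example.com' AS cleandomain
  UNION ALL
  SELECT 'unknown.test' AS cleandomain
),
-- DISTINCT keeps the join from duplicating rows of s
d AS (
  SELECT DISTINCT cleandomain
  FROM keenek1.daily_top_doms
)
SELECT s.*,
       CASE
         WHEN d.cleandomain IS NOT NULL THEN s.cleandomain  -- match found in the list
         ELSE 'other'                                       -- no match
       END AS status
FROM s
LEFT JOIN d
  ON s.cleandomain = d.cleandomain;
```

Unlike the array approach, this does not need the whole list to fit into a single array value, so it may be the safer choice when the list of domains is large.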