I have RDBMS table and Queries which are working perfectly. I have offloaded data from RDBMS to HIVE table.To run the existing queries on HIVE, we need first to make them compatible to HIVE.
Let's take below example with sub-query in select item list. It is syntactically valid and working fine on RDBMS system. But It Will not work on HIVE As per HIVE manual , Hive supports subqueries only in the FROM and WHERE clause.
Example 1 :
SELECT t1.one
,(SELECT t2.two
FROM TEST2 t2
WHERE t1.one=t2.two) t21
,(SELECT t3.three
FROM TEST3 t3
WHERE t1.one=t3.three) t31
FROM TEST1 t1 ;
Example 2:
SELECT a.*
, CASE
WHEN EXISTS
(SELECT 1
FROM tblOrder O
INNER JOIN tblProduct P
ON O.Product_id = P.Product_id
WHERE O.customer_id = C.customer_id
AND P.Product_Type IN (2, 5, 6, 9)
)
THEN 1
ELSE 0
END AS My_Custom_Indicator
FROM tblCustomer C
INNER JOIN tblOtherStuff S
ON C.CustomerID = S.CustomerID ;
Example 3 :
Select component_location_id, component_type_code,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'AXLE'
and component_location_id = cl.component_location_id ) as AXLE,
( select clv.LOCATION_VALUE
from stg_dev.component_location_values clv
where identifier_code = 'SIDE'
and component_location_id = cl.component_location_id ) as SIDE
from stg_dev.component_locations cl ;
I want to know the possible alternative of sub-queries in select item list to make it compatible to hive. Apparently I will be able to transform existing queries in HIVE format.
Any help and guidance is highly appreciated !
The query you provided could be transformed to a simple query with LEFT JOIN
s.
SELECT
t1.one, t2.two AS t21, t3.three AS t31
FROM
TEST1 t1
LEFT JOIN TEST2 t2
ON t1.one = t2.two
LEFT JOIN TEST3 t3
ON t1.one = t3.three
Since there is no limitation in the subqueries, the joins will return the same data. (The subqueries should return only one or no row for each row in TEST1.)
Please note, that your original query could not handle 1..n connections. In most DBMS, subqueries in the SELECT list should return only with a resultset with one columns and one or no row.
Based on HIVE manual :
SELECT t1.one, t2.two, t3.three
FROM TEST1 t1,TEST2 t2, TEST3 t3
WHERE t1.one=t2.two AND t1.one=t3.three;
SELECT t1.one,t2.two,t3.three FROM TEST1 t1 INNER
JOIN TEST2 t2 ON t1.one=t2.two INNER JOIN TEST3 t3
ON t1.one=t3.three WHERE t1.one=t2.two AND t1.one=t3.three;
SELECT t1.one,t2.two as t21,t3.three as t31 FROM TEST1 t1
INNER JOIN TEST2 t2 ON t1.one=t2.two
INNER JOIN TEST3 t3 ON t1.one=t3.three
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.