简体   繁体   中英

How would I return only specific values from a comma separated column entry in HIVEQL?

So I have a table which contains a column where each entry is a comma separated list. See below:

ID class dt
08 A,B,C,D 20220801
21 A,B,C,D,E,F,G 20220801
92 A,B,C 20220801
98 A,B,C,D,E,F,G,H 20220801

In the actual table, some of the class entries contain up to 232 different elements.

I'd like to return only specific values in a new column with the same comma separated format. So as an example from the above, I'd like ID 21 to return a column with 'C,D,E'.

Any help is appreciated.

You may try below query which returns class entries matching with any of C,D,E from each row.

WITH sample_table AS (
  SELECT 08 ID, 'A,B,C,D' class, '20220801' dt UNION ALL
  SELECT 21, 'A,B,C,D,E,F,G', '20220801' UNION ALL
  SELECT 92, 'A,B,C', '20220801' UNION ALL
  SELECT 98, 'A,B,C,D,E,F,G,H', '20220801'
)
SELECT ID, dt, CONCAT_WS(',', collect_list(cls)) clz
  FROM sample_table t LATERAL VIEW EXPLODE(SPLIT(class, ',')) c AS cls
 WHERE cls IN ('C', 'D', 'E')
 GROUP BY ID, dt;
Query results

在此处输入图像描述

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM