How to get combined values from a table in Hive

Question

I have a table in Hive with the following structure:

 col1 col2 col3 col4 col5 col6
 -----------------------------
 AA   NM   ER   NER  NER  NER
 AA   NM   NER  ERR  NER  NER
 AA   NM   NER  NER  TER  NER
 AA   NM   NER  NER  NER  ERY

I wrote a query to fetch the record from the table:

Select distinct(col1),col2, array(concat(
CASE WHEN col3=='ER'  THEN 'ER' 
     WHEN col4=='ERR' THEN 'ERR'
     WHEN col5=='TER' THEN 'TER'
     WHEN col6=='ERY' THEN 'ERY'
ELSE 'NER' END

but it's not working, and I'm not sure how to go about it.

Expected output:

col1 col2 col3
--------------
AA  NM    ['ER','ERR','TER','ERY']

Any suggestion or hint would be really helpful.

Answer 1

You can obtain a string that looks like an array using concat_ws:

Select distinct(col1),col2,concat_ws('','[',
            concat_ws('', "'",col3,"',", "'",col4,"',","'",col5,"',","'",col6,"'"), 
            ']')
from  my_table

Answer 2

Please try the following:

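-- For each (col1, col2) group, max() prefers a matching code over the '' default,
-- because '' sorts before any non-empty string.
select col1, col2, array(
max(CASE WHEN col3=='ER'  THEN 'ER' else '' end),
max(CASE WHEN col4=='ERR' THEN 'ERR' else '' end),
max(CASE WHEN col5=='TER' THEN 'TER' else '' end),
max(CASE WHEN col6=='ERY' THEN 'ERY' else '' end))
from my_table  -- placeholder name; a bare "table" is a reserved word in Hive
group by col1, col2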

Answer 3

This is a bit complicated. I think that simply unpivoting is the simplest solution:

select col1, col2, collect_set(col)
from ((select col1, col2, col3 as col from t
      ) union  -- intentional to remove duplicates
      (select col1, col2, col4 as col from t
      ) union  -- intentional to remove duplicates
      (select col1, col2, col5 as col from t
      ) union  -- intentional to remove duplicates
      (select col1, col2, col6 as col from t
      )
     ) t
where col is not null
group by col1, col2;
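
Note that in the sample data the placeholder value 'NER' is not NULL, so a filter on `col is not null` alone would leave 'NER' in the collected set. A minimal sketch of the same unpivot idea, written with Hive's `LATERAL VIEW explode` and an extra filter on 'NER', might look like the following. The CTE name `sample` and the output alias `codes` are made up for illustration, and the inline rows simply reproduce the table from the question; treat it as one possible variant under those assumptions rather than the answerer's own query.

```sql
-- Hypothetical inline copy of the question's table, so the sketch is self-contained.
WITH sample AS (
  SELECT 'AA' AS col1, 'NM' AS col2, 'ER'  AS col3, 'NER' AS col4, 'NER' AS col5, 'NER' AS col6
  UNION ALL
  SELECT 'AA' AS col1, 'NM' AS col2, 'NER' AS col3, 'ERR' AS col4, 'NER' AS col5, 'NER' AS col6
  UNION ALL
  SELECT 'AA' AS col1, 'NM' AS col2, 'NER' AS col3, 'NER' AS col4, 'TER' AS col5, 'NER' AS col6
  UNION ALL
  SELECT 'AA' AS col1, 'NM' AS col2, 'NER' AS col3, 'NER' AS col4, 'NER' AS col5, 'ERY' AS col6
)
SELECT s.col1,
       s.col2,
       collect_set(x.col) AS codes      -- distinct values per (col1, col2) group
FROM sample s
-- Unpivot: turn the four columns of each row into four rows with one value each.
LATERAL VIEW explode(array(s.col3, s.col4, s.col5, s.col6)) x AS col
WHERE x.col IS NOT NULL
  AND x.col <> 'NER'                    -- drop the placeholder value
GROUP BY s.col1, s.col2;
```

For the sample rows this should yield a single group (AA, NM) whose array holds ER, ERR, TER and ERY. `collect_set` does not guarantee element order, so wrap it in `sort_array` if a stable order matters.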

3 answers

Answer 1
1 2019-02-14 09:10:18

Answer 2
1 accepted 2019-02-14 09:18:19

Answer 3
0 2019-02-14 12:24:50
