Concat and then group by in Hive

I have 3 columns in a table, as shown below:

|---------------------|------------------|-------------|
|      dept           |     class        |    item     |
|---------------------|------------------|-------------|
|          234        |         34       |      6783   |
|---------------------|------------------|-------------|
|          784        |         78       |      2346   |
|---------------------|------------------|-------------|

I'm concatenating the 3 columns to create a new column, item_no (for example, 234-34-6783). When I use item_no in the group by clause, Hive throws an error: 'Invalid table alias or column reference'. Could someone help me with this?

select dept, class, item, concat(dept, '-', class, '-', item) as item_no, sum(sales)
from sales_table
group by dept, class, item, item_no;

The column data types are smallint.

Here are two methods. The first repeats the full expression in the group by clause:

select concat(dept, '-', class, '-', item) as item_no, count(*)
from t
group by concat(dept, '-', class, '-', item) ;

Or group by the underlying columns:

select concat(dept, '-', class, '-', item) as item_no, count(*)
from t
group by dept, class, item ;
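
Applied to the query in the question, the second form might look like the sketch below. It assumes sales_table has a sales column, as the original query implies; total_sales is only an illustrative alias. The one real change is dropping item_no from the group by list, since the error suggests the alias is not resolved there.

```sql
-- Sketch only: same select list as the question, but grouped by the
-- real columns instead of the item_no alias.
select dept,
       class,
       item,
       concat(dept, '-', class, '-', item) as item_no,
       sum(sales) as total_sales   -- total_sales is a hypothetical alias
from sales_table
group by dept, class, item;
```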

That said, I thought Hive supported aliases in group by, so this should also work:

select concat(dept, '-', class, '-', item) as item_no, count(*)
from t
group by item_no ;

This would not work if item_no were also a column in the table, though. Positional notation also works, provided position aliases for group by are enabled in your Hive configuration:

select concat(dept, '-', class, '-', item) as item_no, count(*)
from t
group by 1 ;
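
If you would rather group by the name item_no itself, one more option is to compute it in a subquery, so that it is an ordinary column by the time the outer query groups on it. This is a sketch against the question's sales_table; the subquery alias s and the output name total_sales are made up for illustration.

```sql
-- Sketch only: build item_no in an inner query, then group on it outside.
select s.item_no,
       sum(s.sales) as total_sales   -- hypothetical output name
from (
    select concat(dept, '-', class, '-', item) as item_no,
           sales
    from sales_table
) s                                  -- hypothetical subquery alias
group by s.item_no;
```

This avoids repeating the concat expression, at the cost of an extra level of nesting.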
