
Why can't grouping__id be filtered with a HAVING clause when I use grouping sets (cube, rollup)?

The Hive code is as follows:

set mapred.reduce.tasks = 100;
create table order_dimensions_cube as
select
        grouping__id as groupid,
        user_level             ,
        city_level             ,
        region_name            ,
        province_name          ,
        city_name              ,
        platform               ,
        sale_type              ,
        item_first_cate_name   ,
        app_module             ,
        department             ,
        sum(COALESCE(complete_sum, 0)) as complete_price
from
        data
group by
        user_level          ,
        city_level          ,
        region_name         ,
        province_name       ,
        city_name           ,
        platform            ,
        sale_type           ,
        item_first_cate_name,
        app_module          ,
        department
with cube having grouping__id >= 704;

It turns out that this generates no records.

More info:

  1. I checked that the table data contains a lot of records.
  2. I tried this SQL without the HAVING clause, and it generated a lot of records.

Why does this happen, and how can I solve it if I want to use HAVING to put constraints on the result?

Thank you.

Since you did not provide actual data, please try the following:

select groupid, count(*) from
(select
        grouping__id as groupid,
        user_level             ,
        city_level             ,
        region_name            ,
        province_name          ,
        city_name              ,
        platform               ,
        sale_type              ,
        item_first_cate_name   ,
        app_module             ,
        department             ,
        sum(COALESCE(complete_sum, 0)) as complete_price
from
        data
group by
        user_level          ,
        city_level          ,
        region_name         ,
        province_name       ,
        city_name           ,
        platform            ,
        sale_type           ,
        item_first_cate_name,
        app_module          ,
        department
with cube) A
group by groupid

and see how many records there are for each grouping__id. There could be some issues there. Also, try changing the outer query to:

select * from 
(select
        grouping__id as groupid,
        user_level             ,
        city_level             ,
        region_name            ,
        province_name          ,
        city_name              ,
        platform               ,
        sale_type              ,
        item_first_cate_name   ,
        app_module             ,
        department             ,
        sum(COALESCE(complete_sum, 0)) as complete_price
from
        data
group by
        user_level          ,
        city_level          ,
        region_name         ,
        province_name       ,
        city_name           ,
        platform            ,
        sale_type           ,
        item_first_cate_name,
        app_module          ,
        department 
with cube) A
where groupid >= 704

and see if the problem persists.

This is not a solution, but more of a trial to understand what is going on.
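
In the same diagnostic spirit, it may help to run the cube over only two of the columns first, so that there are just four grouping sets and the grouping__id values can be checked by eye. The sketch below is not a fix. It assumes the same table data and the columns user_level and city_level from the question, and source_rows is only an illustrative alias. The Hive documentation notes that grouping__id behaves differently before and after Hive 2.3.0, so it is worth confirming on your own version which ID goes with which set of aggregated columns before relying on a threshold such as 704.

```sql
-- Diagnostic sketch, not a fix: a two-column cube over the same table.
-- With two group-by columns there are only four grouping sets, so at most
-- four distinct grouping__id values appear in the output.
select
        grouping__id as groupid,     -- the ID Hive assigns to each grouping set
        user_level             ,
        city_level             ,
        count(*) as source_rows      -- illustrative alias: input rows behind each output row
from
        data
group by
        user_level          ,
        city_level
with cube;
```

In the output, a column that was aggregated away shows up as NULL, so comparing the NULL pattern of each row with its groupid shows how the IDs are numbered on your installation. If the source columns themselves contain NULLs, the pattern is ambiguous for those rows.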
