HiveQL equivalent of !> in SQL

I am trying to extract the values from one table that do not exist in another table. However, because the joining column contains NULL values, the NOT IN, NOT EXISTS and LEFT JOIN options do not seem to work.

Therefore, is there a way to apply the 'not greater than' (!>) condition in HiveQL?

For reference, this is the query that I ran with NOT IN; I ran similar queries with NOT EXISTS and LEFT JOIN:

with date_prob as 
(
    select distinct visit 
    from t1
    where dt=20161124
    and dt1!=orig_ts
),

ev_data as
(
    select distinct visit 
    from t1
    where dt=20161124
    and visit is not null
    and orig_ts is not null 
    and uid is not null
), 

fin_data as 
(
    select x.visit 
    from ev_data x
    where x.visit not in 
    (
      select distinct visit 
      from date_prob
      where visit is not null
    ) 
)

The query that I ran for a left join:

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id

from ev_data x
left join date_prob y

where y.id is null
;

The data example:

id        dt1           orig_ts
1     2016-11-24       2016-11-10
2     2016-11-24       2016-11-24 
3     2016-11-24       2010-01-01
4     2016-11-24       2017-01-01
5     2016-11-24       2016-11-24
6     2016-11-24       2016-11-25
7     2016-11-23       2016-11-23 

From this table I want to remove the IDs where dt1 and orig_ts differ by more than one day. Thus, the query should return only the rows where the ID is 2, 5 or 6.

If you want to extract the values from one table that do not exist in another table, you can use a left join and filter where second_table_key is null. This works even if there are NULLs in the keys, because a row of a whose key is NULL never matches and is therefore kept:

--this query will return records from table a that do not exist in b
select a.id
  from a left join b on a.id=b.id
 where b.id is null; --only not joined
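
The same anti-join can also be written with a correlated NOT EXISTS subquery. This is only a sketch, assuming a Hive version that supports subqueries in the WHERE clause (0.13 or later) and the same placeholder tables a and b as above:

```sql
-- records from table a that have no matching id in table b
select a.id
  from a
 where not exists (select 1
                     from b
                    where b.id = a.id); -- correlated on the key
```

As with the left join, a row of a whose id is NULL never satisfies b.id = a.id, so it is returned. A NOT IN subquery behaves differently under standard SQL semantics: a single NULL in the subquery result makes the predicate unknown for every row, which is why the subquery has to filter NULLs out first.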

I have fixed your example (the left join was missing its ON condition). It works:

drop table if exists t1;
create table t1 (id int,dt1 string,           orig_ts string );
insert overwrite table t1
select 1 id,    '2016-11-24' dt1,       '2016-11-10' orig_ts union all
select 2 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all 
select 3 id,    '2016-11-24' dt1,       '2010-01-01' orig_ts union all
select 4 id,    '2016-11-24' dt1,       '2017-01-01' orig_ts union all
select 5 id,    '2016-11-24' dt1,       '2016-11-24' orig_ts union all
select 6 id,    '2016-11-24' dt1,       '2016-11-25' orig_ts union all
select 7 id,    '2016-11-23' dt1,       '2016-11-23' orig_ts;

with date_prob as 
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and dt1!=orig_ts
    and (datediff(dt1,orig_ts) not in ('1','-1'))
),

ev_data as
(
    select distinct id
    from t1
    where dt1='2016-11-24'
    and id is not null
)

select x.id
from ev_data x
left join date_prob y on x.id=y.id
where y.id is null
;

OK
2
5
6
Time taken: 14.166 seconds, Fetched: 3 row(s)
hive>

Works as expected.
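
On the literal question: as far as I know, HiveQL has no !> operator, but "not greater than" is the same comparison as <=. For this data the anti-join can therefore be replaced by a direct filter. A minimal sketch, assuming the t1 table created above and at most one row per id for a given dt1:

```sql
-- ids dated 2016-11-24 whose dt1 and orig_ts differ by at most one day
select distinct id
from t1
where dt1 = '2016-11-24'
  and id is not null
  and abs(datediff(dt1, orig_ts)) <= 1  -- "not greater than 1"
;
```

On the sample data this returns ids 2, 5 and 6. One difference from the left join version: if orig_ts is NULL, datediff returns NULL and the row is dropped here, whereas the left join version keeps it. Add "or orig_ts is null" to the last condition if those rows should be kept.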
