HiveQL equivalent of !> in SQL
I have been trying to extract the values from one table that do not exist in another table. However, as the joining value contains null values, the not in, not exists and left join options do not seem to be working.
Therefore, is there a way to apply the 'not greater than' condition in HiveQL?
For reference, this is the query that I ran with not in; I ran similar queries with not exists and left join:
with date_prob as
(
select distinct visit
from t1
where dt=20161124
and dt1!=orig_ts
),
ev_data as
(
select distinct visit
from t1
where dt=20161124
and visit is not null
and origts is not null
and uid is not null
),
fin_data as
(
select x.visit
from ev_data x
where x.visit not in
(
select distinct visit
from date_prob
where visit is not null
)
)
The query that I ran for a left join:
with date_prob as
(
select distinct id
from t1
where dt1='2016-11-24'
and dt1!=orig_ts
and (datediff(dt1,orig_ts) not in ('1','-1'))
),
ev_data as
(
select distinct id
from t1
where dt1='2016-11-24'
and id is not null
)
select x.id
from ev_data x
left join date_prob y
where y.id is null
;
Example data:
id dt1 orig_ts
1 2016-11-24 2016-11-10
2 2016-11-24 2016-11-24
3 2016-11-24 2010-01-01
4 2016-11-24 2017-01-01
5 2016-11-24 2016-11-24
6 2016-11-24 2016-11-25
7 2016-11-23 2016-11-23
Therefore, from this table I want to remove those IDs where dt1 and orig_ts differ by more than a day. Thus, the query should return only the IDs 2, 5 and 6.
If you want to extract the values from one table that do not exist in another table, you can use a left join and filter where second_table_key is null. This works even if there are NULLs in the keys:
--this query will return records from table a that do not exist in b
select a.id
from a left join b on a.id=b.id
where b.id is null; --only not joined
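To see how NULL keys behave in this pattern, here is a minimal sketch. The table names a and b follow the query above, but their columns and contents are made up for illustration and are not from the question.

```sql
-- hypothetical demo tables; the contents are assumptions for illustration only
drop table if exists a;
drop table if exists b;
create table a (id int);
create table b (id int);

insert overwrite table a
select 1 id union all
select 2 id union all
select cast(null as int) id;

insert overwrite table b
select 2 id union all
select cast(null as int) id;

-- anti-join: keep only the rows of a that found no partner in b
select a.id
from a left join b on a.id=b.id
where b.id is null;
```

With this data the query should return the rows with id 1 and id NULL: a NULL key never compares equal to anything, not even to another NULL, so that row of a stays unmatched. If NULL keys should not appear in the result, add "and a.id is not null" to the where clause.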
I have fixed your example (the left join was missing its on condition). It works:
drop table if exists t1;
create table t1 (id int,dt1 string, orig_ts string );
insert overwrite table t1
select 1 id, '2016-11-24' dt1, '2016-11-10' orig_ts union all
select 2 id, '2016-11-24' dt1, '2016-11-24' orig_ts union all
select 3 id, '2016-11-24' dt1, '2010-01-01' orig_ts union all
select 4 id, '2016-11-24' dt1, '2017-01-01' orig_ts union all
select 5 id, '2016-11-24' dt1, '2016-11-24' orig_ts union all
select 6 id, '2016-11-24' dt1, '2016-11-25' orig_ts union all
select 7 id, '2016-11-23' dt1, '2016-11-23' orig_ts;
with date_prob as
(
select distinct id
from t1
where dt1='2016-11-24'
and dt1!=orig_ts
and (datediff(dt1,orig_ts) not in ('1','-1'))
),
ev_data as
(
select distinct id
from t1
where dt1='2016-11-24'
and id is not null
)
select x.id
from ev_data x
left join date_prob y on x.id=y.id
where y.id is null
;
OK
2
5
6
Time taken: 14.166 seconds, Fetched: 3 row(s)
hive>
Works as expected.
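On the title question itself: "not greater than" (!> in some SQL dialects) can be written in HiveQL as <= or as not (... > ...). For this particular data the anti-join could probably be avoided with a single filter on t1. This is only a sketch of that alternative, assuming the requirement is "dt1 and orig_ts differ by at most one day"; it is not the query from the question.

```sql
-- sketch: single-pass filter instead of the anti-join
-- abs(datediff(...)) <= 1 means "not greater than one day apart"
select distinct id
from t1
where dt1='2016-11-24'
  and abs(datediff(dt1,orig_ts)) <= 1;
```

On the sample data this keeps the ids 2, 5 and 6. One difference to be aware of: a row whose orig_ts is NULL is dropped here, because the comparison evaluates to NULL, whereas the left join version keeps it. Add "or orig_ts is null" to the filter if such rows should be returned.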