Hive: subtract one table from another?
In Hive, I have two tables:
'old_books'
title String, author String, year Int, outOfPrint Boolean;
and
'new_books'
title String, author String, year Int;
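Due to some mistake, the person who created these tables put some new titles into the 'old_books' table.
Is it possible, using only Hive, to subtract from 'old_books' the records that exist in both tables?
So far, I have only managed to write a Hive query that selects the books present in both tables: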
SELECT old_books.* FROM old_books JOIN new_books ON (old_books.title=new_books.title);
How can I subtract the result of this query from 'old_books'?
Assuming you have Hive 0.13 or later, you can use a NOT EXISTS clause:
select * from old_books a where not exists (SELECT 1 FROM old_books b JOIN new_books c ON (b.title = c.title) where a.book_id = b.book_id);
Here is the reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries
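Note that the query in this answer correlates on a.book_id, a column that is not among those listed in the question. A minimal sketch of the same NOT EXISTS idea that correlates directly on title instead, assuming title alone is enough to identify a book (that uniqueness is an assumption, not something the question states):

```sql
-- Keep only the rows of old_books whose title does not appear in new_books.
-- Assumption: title identifies a book; if it does not, extend the predicate
-- with further equality conditions such as a.author = c.author.
SELECT a.*
FROM old_books a
WHERE NOT EXISTS (
  SELECT 1
  FROM new_books c
  WHERE a.title = c.title
);
```

Correlating straight against new_books avoids the extra self-join on old_books, but the result should be the same set of rows under the stated assumption.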
I found that the following works for me:
INSERT OVERWRITE TABLE corrected_old_books
SELECT old_books.* FROM old_books left JOIN new_books ON (new_books.title=old_books.title) where new_books.title is NULL;
I am using /usr/lib/hive/lib/hive-hwi-0.13.0.2.1.3.0-563.jar
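The INSERT OVERWRITE statement in this answer writes into corrected_old_books, so that table presumably has to exist first. A hedged sketch of one way to prepare it and then sanity-check the result, assuming corrected_old_books should have the same schema as old_books and that title is the matching key, as in the answer:

```sql
-- Create an empty target table with the same columns as old_books
-- (assumption: the corrected table should mirror the old_books schema).
CREATE TABLE IF NOT EXISTS corrected_old_books LIKE old_books;

-- After running the INSERT OVERWRITE from the answer, count the titles
-- that still appear in both tables. A count of 0 is what one would expect
-- if the anti-join removed every overlapping title.
SELECT COUNT(*)
FROM corrected_old_books c
JOIN new_books n ON (c.title = n.title);
```

The LEFT JOIN with a `new_books.title is NULL` filter is an anti-join: rows of old_books that find no partner in new_books come back with NULLs on the new_books side, and only those rows are kept.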