Hive UDF to fetch value from distributed cache not working with outer queries

We have written a Hive UDF in Java that fetches a value from a file added to the distributed cache. It works perfectly from a plain select query like:

Query 1.

select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;

But it does not work when we try to create a table from its output, like:

Query 2.

create table new_table
as
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;

It does not even work from an outer select, like:

Query 3.

select t.capital from 
(
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename
) t;

Below is my UDF's evaluate function:

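import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hive.ql.exec.UDF;

public class CountryMap extends UDF {

    // Built lazily on the first call to evaluate()
    Map<Integer, String> countryMap = null;

    public String evaluate(Integer keyCol, String mapFile) {

        if (countryMap == null) {
            countryMap = new HashMap<>();
            // read comma delimited data from mapFile and build a hashmap
            // (file-reading loop omitted in the post); for each line:
            //     countryMap.put(key, value);
        }

        if (countryMap.containsKey(keyCol)) {
            return countryMap.get(keyCol);
        }
        return "NA";
    }
}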
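The file-reading step is omitted from the code above. As a rough sketch only, and not the asker's actual implementation, the lazy load could look something like the following. It assumes each line of the file has the form key,value with an integer key; the class name CountryLookup and its method names are hypothetical, and the Hive UDF wrapper is left out so the snippet runs as plain Java.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original post.
class CountryLookup {

    // Populated on the first lookup and reused afterwards.
    private Map<Integer, String> countryMap = null;

    // Parses "key,value" lines. Skipping blank or malformed lines is an
    // assumption made for this sketch.
    static Map<Integer, String> load(String mapFile) throws IOException {
        Map<Integer, String> map = new HashMap<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(mapFile), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                if (parts.length < 2) {
                    continue; // no comma on this line
                }
                try {
                    map.put(Integer.valueOf(parts[0].trim()), parts[1].trim());
                } catch (NumberFormatException e) {
                    // Skip lines whose key is not an integer.
                }
            }
        }
        return map;
    }

    public String lookup(Integer keyCol, String mapFile) throws IOException {
        if (countryMap == null) {
            // The IOException is allowed to propagate to the caller.
            countryMap = load(mapFile);
        }
        String capital = countryMap.get(keyCol);
        return capital != null ? capital : "NA";
    }
}
```

Letting the IOException propagate, rather than catching it and carrying on with an empty map, means a missing or unreadable file shows up as a failure instead of a silent 'NA'.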

I add the jar and the file, and create the Hive temporary function, like this:

ADD JAR /data/CountryMap-with-dependencies.jar;
ADD FILE /data/MyData.txt;
CREATE TEMPORARY FUNCTION MyFunction as 'CountryMap';

When I run query 1 I get the expected value from the Map, but when I run queries 2 and 3 I get 'NA'. When I returned Map.size() for queries 2 and 3 in place of 'NA', it was zero.
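To narrow down why the map is empty in queries 2 and 3, one option is to have the UDF return a description of what it sees on the node where it runs, in the same way Map.size() was returned in place of 'NA'. The helper below is a hypothetical sketch in plain Java (the name PathProbe and its output format are made up for illustration). It reports whether the path as given exists and whether a file with the same base name exists in the current working directory.

```java
import java.io.File;

// Hypothetical debugging helper, not part of the original post.
class PathProbe {

    // Describes how a path resolves in the current process.
    static String describe(String mapFile) {
        File asGiven = new File(mapFile);
        // Same file name, looked up relative to the working directory.
        File inWorkDir = new File(asGiven.getName());
        return "cwd=" + new File(".").getAbsolutePath()
            + " asGiven.exists=" + asGiven.exists()
            + " baseName.exists=" + inWorkDir.exists();
    }
}
```

Returning PathProbe.describe(mapFile) from evaluate for a few rows of query 1 and of query 3 would show whether both queries see the file at the same location.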

I am puzzled why the outer select or the create table cannot fetch values from countryMap, and why the size of the Map becomes zero.

What version of Hive do you use? Before 0.14.0 you had to set hive.cache.expr.evaluation = false; to get around a bug.
