Hive UDF to fetch value from distributed cache not working with outer queries
We have written a Hive UDF in Java that fetches a value from a file added to the distributed cache. It works perfectly from a plain select query like:
Query 1.
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;
But it does not work when we try to create a table from its output, for example:
Query 2.
create table new_table
as
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;
It does not even work from an outer select, for example:
Query 3.
select t.capital from
(
select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename
) t;
Below is my UDF's evaluate function:
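import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hive.ql.exec.UDF;

public class CountryMap extends UDF {
    // Lookup table, built on the first call to evaluate()
    private Map<Integer, String> countryMap = null;

    public String evaluate(Integer keyCol, String mapFile) {
        if (countryMap == null) {
            countryMap = new HashMap<>();
            // read comma delimited data from mapFile and build a hashmap:
            // for each line, countryMap.put(key, value);
        }
        if (countryMap.containsKey(keyCol)) {
            return countryMap.get(keyCol);
        }
        return "NA";
    }
}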
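The file-reading step is elided in the UDF above. As a rough sketch of what "read comma delimited data and build a hashmap" could look like, assuming each line has the form key,value with an integer key: the class name CountryMapLoader, the method load, and the policy of skipping malformed lines are all hypothetical and not taken from the actual UDF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original UDF.
class CountryMapLoader {

    // Parses "key,value" lines into a map. Lines without a comma or with a
    // non-numeric key are skipped (an assumed policy, not the author's).
    static Map<Integer, String> load(Reader source) {
        Map<Integer, String> map = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Split on the first comma only, so values may contain commas
                String[] parts = line.split(",", 2);
                if (parts.length < 2) {
                    continue; // blank line or no comma: not a key,value pair
                }
                try {
                    map.put(Integer.valueOf(parts[0].trim()), parts[1].trim());
                } catch (NumberFormatException e) {
                    // non-numeric key: skip this line
                }
            }
        } catch (IOException e) {
            // Surface read failures instead of silently leaving the map empty
            throw new UncheckedIOException(e);
        }
        return map;
    }
}
```

Inside evaluate, this could be called as CountryMapLoader.load(new java.io.FileReader(mapFile)). Rethrowing read errors matters here: if a failed open is swallowed, the map simply stays empty and every lookup returns "NA".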
I add the jar and the file, and create the temporary function in Hive like this:
ADD JAR /data/CountryMap-with-dependencies.jar;
ADD FILE /data/MyData.txt;
CREATE TEMPORARY FUNCTION MyFunction as 'CountryMap';
When I run query 1 I get the expected value from the map, but when I run queries 2 and 3 I get 'NA'. When I returned the map's size for queries 2 and 3 in place of 'NA', it was zero.
I am puzzled why the outer select or create table is not able to fetch values from countryMap and why the size of the map becomes zero.
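One possible explanation, which nothing in this post confirms: query 1 may be served directly on the client machine, where /data/MyData.txt exists, while queries 2 and 3 run as cluster tasks. A file added with ADD FILE is typically made available to a task in its working directory under its base name (MyData.txt), so opening the absolute path would fail there and the map would stay empty. A minimal sketch for testing that assumption follows; the class CachedFileResolver and its resolve method are hypothetical names.

```java
import java.io.File;

// Hypothetical helper, not part of the original post.
class CachedFileResolver {

    // Returns the file at the given path if it exists there. Otherwise falls
    // back to the file's base name inside workingDir, which is where a file
    // shipped through the distributed cache is assumed to be placed.
    static File resolve(String path, File workingDir) {
        File asGiven = new File(path);
        if (asGiven.isFile()) {
            return asGiven;
        }
        return new File(workingDir, asGiven.getName());
    }
}
```

In the UDF this could be tried as CachedFileResolver.resolve(mapFile, new File(".")). If queries 2 and 3 then return real values, the absolute path was the problem; passing just "MyData.txt" or "./MyData.txt" as the second argument would be another way to test the same idea.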