
Hive UDF execution via shell script

I have a Hive UDF that works well in the Hive terminal, and I want to execute it via a shell script. In the Hive terminal I am able to execute the following commands:

use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';

But when I add the same commands to a shell script:

hive -e "use mashery_db;"
hive -e "add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;"
hive -e "add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;"
hive -e "CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

The first `hive -e` works well and adds the jar, but the last one (create temporary function) doesn't work. I am getting the error below:

FAILED: ParseException line 1:35 mismatched input 'com' expecting StringLiteral near 'AS' in create function statement

I have also tried with single quotes:

hive -e "CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

Then I am getting FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask:

FAILED: Class com.mashery.nextdata.hive.udf.GeoIPGenericUDF not found

Does a Hive UDF support execution from a shell script, and if it does, what am I doing wrong? Thanks in advance.

Each invocation of hive -e spawns a new process with a new Hive shell that has no memory of what the previous one did, so Hive "forgets" where the UDF is. One solution is to chain the statements in a single command, but it is better form to put all your Hive commands in a file (for instance "commands.hql") and use hive -f commands.hql instead of -e.

The file would look like this:

use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';
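
The shell script then only needs a single Hive invocation, for example (assuming the file above was saved as commands.hql in the working directory):

#!/bin/bash
# Run all statements in one Hive session, so the added jar, the added
# file, and the temporary function are all visible to each other.
hive -f commands.hql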

You can get this to work with both hive -e and hive -f:

hive -e "use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';"

Putting the statements in a file and using hive -f hive_file.hql would work as well.
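
As a rough sketch, a wrapper shell script could generate the HQL file and run it in one session; the script name, the temp-file path, and the trailing comment below are illustrative assumptions, not part of the original answer:

#!/bin/bash
# Hypothetical wrapper: write all statements to a temporary .hql file,
# then execute them together with a single hive -f call.
set -e
HQL_FILE="$(mktemp /tmp/geoip_udf.XXXXXX.hql)"

cat > "$HQL_FILE" <<'EOF'
use mashery_db;
add jar hdfs://nameservice1/tmp/nextdata_aggregations/custom_jar/readerCheck.jar;
add file hdfs://nameservice1/tmp/GeoLite2-City.mmdb;
CREATE TEMPORARY FUNCTION geoip AS 'com.mashery.nextdata.hive.udf.GeoIPGenericUDF';
-- Any query that calls geoip() must go in this same file, because the
-- temporary function only exists for the lifetime of this session.
EOF

hive -f "$HQL_FILE"
rm -f "$HQL_FILE"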
