Reduce hive startup time for many hive -e invocations

I am invoking hive -e hundreds of times from the command line like this:

cat hive_script.hql | parallel --gnu hive -e '{}' 

where each line in hive_script.hql can run independently and in any order.

Are there any --hiveconf parameters that can reduce the startup time? The Apache web page below seems to suggest there might be:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution

"This is frustrating as Hive becomes closely coupled with scripting languages. The Hive startup time of a couple seconds is non-trivial when doing thousands of manipulations such as multiple hive -e invocations."

You can't speed up hive -e itself, but you can put multiple queries in one script so that the startup cost is paid once per script instead of once per query.
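
As a rough sketch of the "multiple queries in one script" idea: the function below splits the query file into a few chunk files and runs one hive -f per chunk, so parallelism is kept while each chunk pays the startup time only once. It assumes GNU split is available and that every line in the file is a complete statement ending in a semicolon. The function name run_batched and the HIVE_BIN override are hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: run a one-statement-per-line file as N batches.
# Assumes GNU split (for -n l/N) and that each line ends with ';'.
# HIVE_BIN is an illustrative override; it defaults to the real hive CLI.
run_batched() {
  local script="$1" jobs="${2:-4}" hive_bin="${HIVE_BIN:-hive}"
  local workdir pids="" status=0 chunk pid

  workdir="$(mktemp -d)" || return 1

  # l/N splits into N files without breaking any line in the middle.
  split -n "l/${jobs}" "$script" "${workdir}/chunk_" || return 1

  for chunk in "${workdir}"/chunk_*; do
    [ -s "$chunk" ] || continue      # skip empty chunks
    "$hive_bin" -f "$chunk" &        # one startup per chunk, not per query
    pids="$pids $!"
  done

  # Wait for every batch and remember whether any of them failed.
  for pid in $pids; do
    wait "$pid" || status=1
  done

  rm -rf "$workdir"
  return "$status"
}
```

For example, run_batched hive_script.hql 8 would start eight hive processes in total rather than one per line. Since the queries are independent, the order in which the chunks finish should not matter.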

If that doesn't work, you will need to look at HiveServer2 and invoke the queries from a JDBC client.
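
One way to try the HiveServer2 route without writing a JDBC program is Beeline, the JDBC-based command-line client that ships with Hive. The sketch below sends the whole file through a single Beeline session. The function name, the HS2_URL and BEELINE_BIN variables, and the localhost:10000 address are placeholders rather than values from the question, so adjust them to your cluster.

```shell
# Hypothetical helper: run a query file through HiveServer2 via Beeline.
# HS2_URL and BEELINE_BIN are illustrative overrides; the default URL is
# only a placeholder for wherever HiveServer2 is actually listening.
run_via_hiveserver2() {
  local script="$1"
  local url="${HS2_URL:-jdbc:hive2://localhost:10000/default}"
  local beeline_bin="${BEELINE_BIN:-beeline}"

  # -u gives the JDBC connection URL, -f the script file to execute.
  "$beeline_bin" -u "$url" -f "$script"
}
```

Beeline still starts its own JVM, so the benefit comes from running many statements over one connection to an already running server, not from calling it once per query.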
