
Reduce hive startup time for many hive -e invocations

I am invoking hive -e hundreds of times from the command line in this way:

cat hive_script.hql | parallel --gnu hive -e '{}' 

where each line in hive_script.hql can run independently and in any order.

Are there any --hiveconf parameters that can reduce the startup time? The Apache wiki seems to suggest there might be, at

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution

"This is frustrating as Hive becomes closely coupled with scripting languages. The Hive startup time of a couple seconds is non-trivial when doing thousands of manipulations such as multiple hive -e invocations."

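You can't speed up an individual hive -e invocation, but you can put multiple queries in one script and run it with hive -f, so the startup cost is paid once per script rather than once per query.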
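As a rough sketch of that batching idea: the helper below splits the statement file into a few chunks and runs each chunk with a single hive -f, several chunks at a time. The function name run_batched and the HIVE_CMD override are made up for illustration, and it assumes GNU split, an xargs that supports -P, and that every line in the input already ends with a semicolon.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_batched INPUT N_CHUNKS  (hypothetical helper, not part of Hive)
# Splits INPUT into N_CHUNKS files on line boundaries and runs each file
# with one `hive -f`, so Hive starts N_CHUNKS times instead of once per line.
# Assumes each line of INPUT is a complete statement ending in ';'.
run_batched() {
  local input=$1 chunks=$2 workdir
  workdir=$(mktemp -d)

  # GNU split: "l/N" produces N files without breaking any line in two.
  split -n "l/${chunks}" "$input" "${workdir}/chunk_"

  # Run up to N_CHUNKS invocations in parallel, one chunk file each.
  # HIVE_CMD is an illustrative override; it defaults to plain `hive`.
  printf '%s\n' "${workdir}"/chunk_* \
    | xargs -P "$chunks" -n 1 ${HIVE_CMD:-hive} -f

  rm -rf "$workdir"
}
```

Something like run_batched hive_script.hql 8 would then start Hive eight times in total. The chunk count is a trade-off: fewer chunks means fewer startups but less parallelism.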

If that doesn't work, you will need to look at HiveServer2 and submit the queries from a JDBC client.
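One JDBC client that ships with Hive is Beeline. A minimal sketch, assuming a HiveServer2 instance is already running and reachable: the wrapper name run_via_hs2, the BEELINE_CMD override, and the JDBC URL in the usage note are placeholders, not values from the question.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_via_hs2 JDBC_URL SCRIPT  (hypothetical wrapper)
# Sends every statement in SCRIPT to an already-running HiveServer2
# over a single Beeline connection.
run_via_hs2() {
  local url=$1 script=$2
  # -u gives the JDBC URL, -f the script file to execute.
  # BEELINE_CMD is an illustrative override; it defaults to `beeline`.
  "${BEELINE_CMD:-beeline}" -u "$url" -f "$script"
}
```

A call might look like run_via_hs2 jdbc:hive2://localhost:10000/default hive_script.hql, with the host, port, and database adjusted to your cluster. Statements in one script run sequentially on that connection, so parallelism would need several connections.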
