
Packaging or automating execution of Hive queries

In Oracle or other databases, there is the concept of a PL/SQL package, where we can package multiple queries/procedures and call them from a UNIX script. For Hive queries, what process is used to package and automate query processing in actual production environments?

If you are looking to automate the execution of numerous Hive queries, the hive or beeline CLI (think sqlplus with Oracle) allows you to pass a file containing one or more statements, such as multiple inserts, selects, create tables, etc. The contents of that file can be generated programmatically using your favorite scripting language, such as Python or shell.
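As a minimal sketch of generating such a file from a shell script: the table names (sales_raw, sales_daily), the run_date value, and the file name daily_load.hql are hypothetical, and the beeline command is only printed rather than executed because HOST_NAME and DB are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values for illustration; none of these come from a real cluster.
run_date="2024-01-01"
hql_file="$(mktemp -d)/daily_load.hql"

# Write several statements into one file, one after another.
# The unquoted heredoc delimiter lets the shell substitute ${run_date}.
cat > "$hql_file" <<EOF
CREATE TABLE IF NOT EXISTS sales_daily (item STRING, total DOUBLE, ds STRING);
INSERT INTO TABLE sales_daily
SELECT item, SUM(amount), '${run_date}' FROM sales_raw WHERE ds = '${run_date}' GROUP BY item;
SELECT COUNT(*) FROM sales_daily WHERE ds = '${run_date}';
EOF

# Print the command instead of running it, since HOST_NAME and DB are placeholders.
echo "beeline -u \"jdbc:hive2://HOST_NAME:10000/DB\" -f $hql_file"
```

Generating the file first and passing it to the CLI afterwards keeps the generated SQL available on disk for inspection if a run fails.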

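See the "-f" option (run the statements in a file) in this documentation; the "-i" option listed there is for an initialization file that runs before the session starts: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli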

In terms of a procedural language, please see: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156

HPL/SQL does have a Create Package option, but if whatever you are trying to achieve is scripted outside of HPL/SQL (e.g. Python, shell), you can 'package' your application in accordance with the scripting best practices of your chosen language.

To run multiple queries, simply write them one after another in a file (say 'hivescript.hql'). The file can then be run from bash by passing it to beeline or the hive shell:

beeline -u "jdbc:hive2://HOST_NAME:10000/DB" -f hivescript.hql
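If several such files need to run in sequence, a small wrapper can stop at the first failure. This is only a sketch: the function name run_hql_files and the overridable variables BEELINE_CMD and JDBC_URL are hypothetical, and it assumes beeline exits with a non-zero status when a statement fails, which is worth verifying on your Hive version.

```shell
#!/usr/bin/env bash

# Hypothetical settings; override them in the environment as needed.
BEELINE_CMD="${BEELINE_CMD:-beeline}"
JDBC_URL="${JDBC_URL:-jdbc:hive2://HOST_NAME:10000/DB}"

# Run each .hql file in the order given and stop at the first failure.
run_hql_files() {
  local f
  for f in "$@"; do
    echo "running $f"
    if ! "$BEELINE_CMD" -u "$JDBC_URL" -f "$f"; then
      echo "failed: $f" >&2
      return 1
    fi
  done
}
```

It could then be called from a scheduled script as, for example, run_hql_files 01_create.hql 02_load.hql, where both file names are made up for illustration.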
