
Hive Insert works fine from Hive CLI but fails from terminal

I currently have the following situation:

I have a shell script that creates two tables and then fills one of them with data from the other.

My script looks roughly like this:

    hive -e "CREATE EXTERNAL TABLE table1 ... ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/input/'"
    hive -e "CREATE EXTERNAL TABLE table2 ... PARTITIONED BY ..."
    hive -e "WITH data AS (SELECT date, ...) FROM data INSERT OVERWRITE TABLE table2 PARTITION(part_date) SELECT ... date"

The rest of the shell script then selects certain data from table2. I use a shell script because some logic has to be applied before I can run the selects on table2.

The script runs without errors: the tables are created and table1 has data in it, but table2 ends up empty. It works with a very small test dataset, but as soon as the dataset gets bigger (>1 GB), table2 is empty.

If I run the very same commands manually from the Hive CLI, everything works fine and table2 contains the expected data.

Why does this happen, and how can I resolve it?

The commands in your shell script must be executed sequentially for table2 to get its data.

Try this in your shell script:

    hive -e "your first query" &&
    hive -e "your second query" &&
    hive -e "your third query"

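This should execute your Hive queries one after another. With &&, each command starts only after the previous one has finished with a zero exit status, so the second query waits for the first to succeed and the third runs only if both the first and second succeeded. If any step fails, the remaining queries are skipped instead of running against incomplete data.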
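The same chaining can be sketched as a small helper that also reports which step failed. This is only an illustration, not part of the original answer: the function name run_hive_steps and the HIVE_CMD override are hypothetical, and the query strings are placeholders for the real statements.

```shell
#!/usr/bin/env bash
# Minimal sketch: run each statement in its own `hive -e` call, in order,
# and stop at the first call that exits with a non-zero status.
# run_hive_steps and HIVE_CMD are hypothetical names used for illustration;
# HIVE_CMD only exists so the helper can be tried without a Hive installation.

run_hive_steps() {
    local hive_cmd="${HIVE_CMD:-hive}"
    local query
    for query in "$@"; do
        # `hive -e` executes the quoted query string and returns its exit status.
        if ! "$hive_cmd" -e "$query"; then
            echo "hive step failed: $query" >&2
            return 1
        fi
    done
}

# Usage (placeholders stand in for the real statements):
# run_hive_steps "your first query" "your second query" "your third query" || exit 1
```

Compared with a bare && chain, the behavior is the same (later queries are skipped after a failure), but the message on stderr makes it easier to see which statement stopped the script.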
