Hive query getting stuck in the middle of execution

Question

Colleagues,

I'm facing an issue while executing a bash file that runs an SQL script in Hive: it always gets stuck at the same place, map = 100%, reduce = 67%.

I tried different numbers of mappers and reducers, along with other tuning settings:

SET hive.exec.parallel=true;
SET hive.default.fileformat=RCFILE;
SET hive.stats.autogather=false;
SET hive.exec.compress.output=true; 
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec; 
SET mapred.output.compression.type=BLOCK;
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

But the result is still the same: the reduce progress stays at 67%.

Hive version - 0.13.0.2.XXX

The SQL creates a table from several sources; it is a simple join of several tables.

Any ideas how to tune this?

Any help will be appreciated.

Answer 1

The phases of a Reducer are:

Shuffle
Sort
Reduce

Each of these phases accounts for roughly a third of the reported reduce progress, so getting stuck at 67% indicates that Shuffle and Sort have completed but none of your partitions is able to finish the Reduce phase. The Reduce phase is where your actual reducer logic runs, which for a Hive query means the join itself, so that logic is unable to complete. You should examine your query and also look at the Hive logs to see why it cannot run to completion.
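One way to start examining the query is to ask Hive for its execution plan. The following is only a sketch: the tables `orders` and `customers` and their columns are hypothetical stand-ins for the ones in the real script, which the question does not show.

```sql
-- Hypothetical tables and columns; substitute the join from your own script.
-- EXPLAIN prints the stages Hive will run, including which operators
-- (for example the join) execute on the reduce side.
EXPLAIN
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;
```

If the plan shows the join running in a reduce stage, the join keys are the first thing worth checking, because all rows with the same key go to the same reducer.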

Answer 2

So, it turned out that there were many duplicate records, and that is why the execution was getting stuck.
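A check along these lines can reveal such duplicates before the join is run. This is a minimal sketch with assumed names: `customers`, `orders`, `customer_id`, `customer_name` and `order_id` are hypothetical and do not come from the original script.

```sql
-- Hypothetical table and column names, for illustration only.
-- 1. List the join keys that occur more than once, worst offenders first.
SELECT customer_id, COUNT(*) AS row_count
FROM customers
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY row_count DESC
LIMIT 20;

-- 2. If the duplicates are not wanted, keep one row per key before joining.
SELECT o.order_id, c.customer_name
FROM orders o
JOIN (
  SELECT customer_id, customer_name
  FROM (
    SELECT customer_id,
           customer_name,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY customer_name) AS rn
    FROM customers
  ) ranked
  WHERE rn = 1
) c
  ON o.customer_id = c.customer_id;
```

When a key appears m times on one side of a join and n times on the other, the join emits m × n rows for that key, all produced by a single reducer. A handful of heavily duplicated keys can therefore keep one reducer busy long after the others finish, which would look like a job frozen at 67%.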


solution1
2 2014-12-13 14:28:54

solution2
0 ACCEPTED 2015-01-31 01:32:30
