[英]How to run a query on the view in Hive?
我們正在視圖上運行一個簡單的select(包含大量數據),並且收到“超出了GC開銷限制,發生了內存不足錯誤。我們要運行此查詢,以便在該視圖頂部運行的報表可以工作。它在Tez上運行。
查詢運行4個小時以上,但失敗。 有什么辦法可以像某些設置選項一樣運行此查詢?
詢問
select * from inc_cts.v_report_pub_view;
錯誤信息 -
TaskAttempt 0 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
TaskAttempt 1 killed
TaskAttempt 2 killed
TaskAttempt 3 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at or
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
TaskAttempt 4 killed
TaskAttempt 5 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
TaskAttempt 6 failed, info=
» Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:204)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:182)
... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.listRealloc(FlatRowContainer.java:259)
at org.apache.hadoop.hive.ql.exec.persistence.FlatRowContainer.add(FlatRowContainer.java:86)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:133)
at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
... 4 more
根據日志,異常是OutOfMemoryError: GC overhead limit exceeded
MapJoin HashTableLoader中OutOfMemoryError: GC overhead limit exceeded
了OutOfMemoryError: GC overhead limit exceeded
。
檢查您當前的設置並相應增加:
set hive.tez.container.size=4096MB;
set hive.auto.convert.join.noconditionaltask.size=1370MB --recommended one third of container size
嘗試使用內存優化的哈希表:
set hive.mapjoin.optimized.hashtable=true;
set hive.mapjoin.optimized.hashtable.wbsize=10485760; --Default Value (10 * 1024 * 1024)
--Optimized hashtable uses a chain of buffers to store data. This is one buffer size.
最后,如果沒有幫助,可以關閉mapjoin:
set hive.auto.convert.join=false;
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.