
Hive - Out of Memory Exception - Java Heap Space

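I am running a Hive insert on top of Parquet files (created using Spark). The Hive insert uses a partitioned by clause. At the very end, while the screen is printing messages such as "Loading partition {=xyz, =123, =abc}", a Java heap space exception is thrown.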

java.lang.OutOfMemoryError: Java heap space
         at java.util.HashMap.createEntry(HashMap.java:901)
         at java.util.HashMap.addEntry(HashMap.java:888)
         at java.util.HashMap.put(HashMap.java:509)
         at org.apache.hadoop.hive.metastore.api.Partition.<init>(Partition.java:229)
         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.deepCopy(HiveMetaStoreClient.java:1356)
         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:1003)
         at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
         at com.sun.proxy.$Proxy9.getPartitionWithAuthInfo(Unknown Source)
         at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1611)
         at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1565)
         at org.apache.hadoop.hive.ql.exec.StatsTask.getPartitionsList(StatsTask.java:403)
         at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:150)
         at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:117)
         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
         at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:456)
         at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:466)
         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)

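I set the following properties when running the job and tried both higher and lower values, but every time I get this error at the end.

Properties toggled: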

 set mapred.map.tasks=100;
 set mapred.reduce.tasks=100;
 set mapreduce.map.java.opts=-Xmx4096m;
 set mapreduce.reduce.java.opts=-Xmx4096m;
 set hive.exec.max.dynamic.partitions.pernode=100000;
 set hive.exec.max.dynamic.partitions=100000;

Please let me know what is going wrong here. The Hive version is 0.13.

hive-env.sh

 if [ "$SERVICE" = "cli" ]; then
   if [ -z "$DEBUG" ]; then
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms12288m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
   else
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms12288m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
   fi
 fi

# The heap size of the jvm started by hive shell script can be controlled via:
#
export HADOOP_HEAPSIZE=4096

It may be related to issue HIVE-10149. Try setting hive.optimize.sort.dynamic.partition to true.
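
A minimal sketch of how that suggestion might be applied in a Hive session. The property name comes from the answer; the table names (target_table, source_parquet) and the column names are hypothetical placeholders, since the question does not show the actual insert statement. The two hive.exec.dynamic.partition settings are assumed to be needed for a dynamic-partition insert and may already be set in your environment.

```sql
-- Sort rows by the dynamic partition columns so that each reducer
-- keeps only one record writer open per partition value at a time.
set hive.optimize.sort.dynamic.partition=true;

-- Assumption: dynamic partitioning is enabled for this insert.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical tables and columns, for illustration only.
INSERT OVERWRITE TABLE target_table PARTITION (part_a, part_b, part_c)
SELECT col1, col2, part_a, part_b, part_c
FROM source_parquet;
```

Whether this removes the error is worth testing rather than assuming: the stack trace in the question shows the failure inside StatsTask in the Hive CLI process, after the partitions are loaded, so the map and reduce heap settings listed above would not affect that JVM.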
