
Hive - Out of Memory Exception - Java Heap Space

I am running a Hive insert on top of Parquet files (created using Spark). The Hive insert uses a partitioned by clause. But at the end, while the screen is printing messages like "Loading partition {=xyz, =123, =abc}", a Java heap space exception is thrown:

java.lang.OutOfMemoryError: Java heap space
         at java.util.HashMap.createEntry(HashMap.java:901)
         at java.util.HashMap.addEntry(HashMap.java:888)
         at java.util.HashMap.put(HashMap.java:509)
         at org.apache.hadoop.hive.metastore.api.Partition.<init>(Partition.java:229)
         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.deepCopy(HiveMetaStoreClient.java:1356)
         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:1003)
         at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:606)
         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
         at com.sun.proxy.$Proxy9.getPartitionWithAuthInfo(Unknown Source)
         at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1611)
         at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1565)
         at org.apache.hadoop.hive.ql.exec.StatsTask.getPartitionsList(StatsTask.java:403)
         at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:150)
         at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:117)
         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
         at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:456)
         at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:466)
         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
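
For reference, here is a minimal sketch of the kind of setup being described; the table, column, and partition names are hypothetical, not taken from the actual job:

 -- hypothetical external table over the Spark-written Parquet files
 CREATE EXTERNAL TABLE parquet_source (col_a STRING, col_b BIGINT, part_col STRING)
 STORED AS PARQUET
 LOCATION '/path/to/parquet_source';

 -- dynamic-partition insert of the kind that fails at the "Loading partition" stage;
 -- assumes target_table already exists and is PARTITIONED BY (part_col STRING)
 SET hive.exec.dynamic.partition=true;
 SET hive.exec.dynamic.partition.mode=nonstrict;
 INSERT OVERWRITE TABLE target_table PARTITION (part_col)
 SELECT col_a, col_b, part_col FROM parquet_source;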

I set the following properties while running the job, and tried changing the values both higher and lower, but every time I hit this error at the end.

Properties tried:

 set mapred.map.tasks=100;
 set mapred.reduce.tasks=100;
 set mapreduce.map.java.opts=-Xmx4096m;
 set mapreduce.reduce.java.opts=-Xmx4096m;
 set hive.exec.max.dynamic.partitions.pernode=100000;
 set hive.exec.max.dynamic.partitions=100000;

Please suggest what is going wrong here. The Hive version is 0.13.

hive-env.sh

 if [ "$SERVICE" = "cli" ]; then
   if [ -z "$DEBUG" ]; then
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms12288m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit"
   else
     export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms12288m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
   fi
 fi

# The heap size of the JVM started by the hive shell script can be controlled via:
#
export HADOOP_HEAPSIZE=4096

It is possibly related to the HIVE-10149 issue. Try setting hive.optimize.sort.dynamic.partition to true.
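
A minimal sketch of applying that setting before re-running the insert (table and column names are hypothetical, as in the question):

 -- hive.optimize.sort.dynamic.partition sorts rows by the dynamic partition
 -- columns, so only one partition writer is kept open at a time
 set hive.optimize.sort.dynamic.partition=true;

 -- then re-run the dynamic-partition insert, e.g.:
 INSERT OVERWRITE TABLE target_table PARTITION (part_col)
 SELECT col_a, col_b, part_col FROM parquet_source;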
