How do I fix a GC problem in a Spark cluster? Can anybody explain how to deal with ParOldGen and PSYoungGen?
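I have checked all the configuration provided, including the shuffle settings spark.default.parallelism and spark.sql.shuffle.partitions,
as well as all the required memory options, such as executor memory and driver memory. I have enough memory, about 64 GB, but I do not know why this error occurs...
I would like to know whether it can be solved through memory configuration. The job executes all the earlier tasks but fails on 31 tasks. The query is large; for smaller queries it runs fine.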
// Log the SQL text, then execute it through the SQLContext.
logger.debug(String.format("Executing SQL %s", taskExec));
Dataset<Row> dfTmp = sqlContext.sql(taskExec);
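For reference, the settings named above could be passed to spark-submit as `--conf key=value` pairs. The sketch below only builds that argument list in plain Java. `SparkConfArgs` is a hypothetical helper, not part of Spark, and every value is a placeholder, since the question does not state the real ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spark): renders settings as spark-submit arguments.
final class SparkConfArgs {

    // Turns each key/value pair into the two arguments "--conf" and "key=value".
    static List<String> toSubmitArgs(Map<String, String> settings) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    // Placeholder values for illustration only; the real values are not given in the post.
    static Map<String, String> exampleSettings() {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("spark.default.parallelism", "200");
        s.put("spark.sql.shuffle.partitions", "200");
        s.put("spark.executor.memory", "12g");
        s.put("spark.driver.memory", "4g");
        return s;
    }

    public static void main(String[] args) {
        // Prints the arguments on one line, ready to paste after "spark-submit".
        System.out.println(String.join(" ", toSubmitArgs(exampleSettings())));
    }
}
```

Keeping the settings in one ordered map makes it easy to compare the values used for the failing large query with those used for the smaller queries that succeed.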
AdaptiveSizeStop: collection: 107
[PSYoungGen: 2917987K->2917987K(3547136K)] [ParOldGen: 8387375K->8387375K(8388608K)] 11305363K->11305363K(11935744K), [Metaspace: 72368K->72368K(1114112K)], 0.4457447 secs] [Times: user=2.27 sys=0.00, real=0.44 secs]
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
# Executing /bin/sh -c "kill 9085"...
282.474: [Full GC (Ergonomics) 282.535: [SoftReference, 0 refs, 0.0000665 secs]282.535: [WeakReference, 956 refs, 0.0001140 secs]282.535: [FinalReference, 1092 refs, 0.0000635 secs]282.535: [PhantomReference, 0 refs, 38 refs, 0.0000145 secs]282.536: [JNI Weak Reference, 0.0000145 secs]AdaptiveSizeStart: 283.597 collection: 108
PSAdaptiveSizePolicy::compute_eden_space_size limits: desired_eden_size: 3086984786 old_eden_size: 3023044608 eden_limit: 3023044608 cur_eden: 2991587328 max_eden_size: 3023044608 avg_young_live: 2739392512
PSAdaptiveSizePolicy::compute_eden_space_size: gc time limit gc_cost: 1.000000 GCTimeLimit: 98
PSAdaptiveSizePolicy::compute_eden_space_size: costs minor_time: 0.144870 major_cost: 0.975053 mutator_cost: 0.000000 throughput_goal: 0.990000 live_space: 11575527424 free_space: 5652873216 old_eden_size: 3023044608 desired_eden_size: 3023044608
PSAdaptiveSizePolicy::compute_old_gen_free_space limits: desired_promo_size: 3143082161 promo_limit: 2629828608 free_in_old_gen: 20183040 max_old_gen_size: 8589934592 avg_old_live: 8569751552
PSAdaptiveSizePolicy::compute_old_gen_free_space: gc time limit gc_cost: 1.000000 GCTimeLimit: 98
PSAdaptiveSizePolicy::compute_old_gen_free_space: costs minor_time: 0.144870 major_cost: 0.975053 mutator_cost: 0.000000 throughput_goal: 0.990000 live_space: 11577579520 free_space: 5652873216 old_promo_size: 2629828608 desired_promo_size: 2629828608
AdaptiveSizeStop: collection: 108
[PSYoungGen: 2921472K->778705K(3547136K)] [ParOldGen: 8387375K->8386929K(8388608K)] 11308847K->9165634K(11935744K), [Metaspace: 72370K->72370K(1114112K)], 1.1228849 secs] [Times: user=8.59 sys=0.74, real=1.12 secs]
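Each generation entry in these lines reads as before->after(capacity), in KB. As a rough aid for reading them, the sketch below pulls one entry out of a log line and reports how much was reclaimed and how full the generation stayed. `GcGenUsage` is a hypothetical helper written for this post, not a JVM or Spark API. Applied to the line above, ParOldGen goes from 8387375K to 8386929K, so the full GC freed only 446K and the old generation stayed almost completely full.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads one "Name: beforeK->afterK(capacityK)" entry from a GC log line.
final class GcGenUsage {
    // Matches e.g. "ParOldGen: 8387375K->8386929K(8388608K)".
    private static final Pattern ENTRY =
            Pattern.compile("(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    final String name;
    final long beforeKb;
    final long afterKb;
    final long capacityKb;

    private GcGenUsage(String name, long beforeKb, long afterKb, long capacityKb) {
        this.name = name;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.capacityKb = capacityKb;
    }

    // Returns the entry for the named generation, or throws if the line has none.
    static GcGenUsage find(String logLine, String generation) {
        Matcher m = ENTRY.matcher(logLine);
        while (m.find()) {
            if (m.group(1).equals(generation)) {
                return new GcGenUsage(m.group(1), Long.parseLong(m.group(2)),
                        Long.parseLong(m.group(3)), Long.parseLong(m.group(4)));
            }
        }
        throw new IllegalArgumentException("no entry for " + generation);
    }

    // KB freed by this collection in the generation.
    long reclaimedKb() {
        return beforeKb - afterKb;
    }

    // Occupancy after the collection, as a percentage of the generation's capacity.
    double percentFullAfter() {
        return 100.0 * afterKb / capacityKb;
    }

    public static void main(String[] args) {
        String line = "[PSYoungGen: 2921472K->778705K(3547136K)] "
                + "[ParOldGen: 8387375K->8386929K(8388608K)] 11308847K->9165634K(11935744K)";
        for (String gen : new String[] {"PSYoungGen", "ParOldGen"}) {
            GcGenUsage u = find(line, gen);
            System.out.printf("%s reclaimed %dK, %.2f%% full after GC%n",
                    u.name, u.reclaimedKb(), u.percentFullAfter());
        }
    }
}
```

An old generation that stays this full after a full collection means nearly everything in it is still reachable, which is consistent with the java.lang.OutOfMemoryError: Java heap space reported in the log.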
10:51:46.868 [Executor task launch worker for task 9593] ERROR org.apache.spark.executor.Executor - Exception in task 30.0 in stage 144.0 (TID 9593)
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:200)
at org.apache.spark.sql.catalyst.util.package$$anonfun$sideBySide$1.apply(package.scala:113)
at org.apache.spark.sql.catalyst.util.package$$anonfun$sideBySide$1.apply(package.scala:112)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.catalyst.util.package$.sideBySide(package.scala:112)
at org.apache.spark.sql.catalyst.util.package$.sideBySide(package.scala:104)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$5.apply(RuleExecutor.scala:137)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$5.apply(RuleExecutor.scala:138)
at org.apache.spark.internal.Logging$class.logDebug(Logging.scala:58)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.logDebug(RuleExecutor.scala:40)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:134)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$$anonfun$canonicalize$1.apply(GenerateUnsafeProjection.scala:354)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$$anonfun$canonicalize$1.apply(GenerateUnsafeProjection.scala:354)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.canonicalize(GenerateUnsafeProjection.scala:354)
at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.generate(GenerateUnsafeProjection.scala:362)
10:51:46.900 [SIGTERM handler] ERROR org.apache.spark.executor.CoarseGrainedExecutorBackend - RECEIVED SIGNAL TERM
10:51:46.918 [Thread-2] INFO org.apache.spark.storage.DiskBlockManager - Shutdown hook called
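The stack trace above runs through Logging.logDebug and sideBySide into StringBuilder.append, so the heap ran out while a debug message was being assembled. The sketch below illustrates the general pattern of a lazily built log message, which is only constructed when the debug level is enabled. It uses java.util.logging as a stand-in for Spark's own logging, and `LazyDebugDemo` is a hypothetical name, so treat it as an illustration under those assumptions. The post does not confirm whether the log level played a role in this failure.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo: counts how often an expensive debug message is actually built.
final class LazyDebugDemo {

    // Logs one FINE (debug-level) message at the given logger level and
    // returns how many times the message string was constructed.
    static int buildsAtLevel(Level level) {
        Logger logger = Logger.getLogger("lazy.debug.demo");
        logger.setUseParentHandlers(false); // keep the demo from printing the message
        logger.setLevel(level);

        AtomicInteger builds = new AtomicInteger();
        Supplier<String> expensiveMessage = () -> {
            builds.incrementAndGet();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append("plan line ").append(i).append('\n');
            }
            return sb.toString();
        };

        // The supplier is only invoked if FINE is loggable at the current level.
        logger.fine(expensiveMessage);
        return builds.get();
    }

    public static void main(String[] args) {
        System.out.println("builds at INFO: " + buildsAtLevel(Level.INFO));
        System.out.println("builds at FINE: " + buildsAtLevel(Level.FINE));
    }
}
```

With the level at INFO the message is never built. With the level at FINE it is built on every call, and for a very large query plan that string could itself be large.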
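This was a strange problem. I had checked all the configuration details and, as I mentioned, I have enough memory, yet it still failed with a GC out-of-memory error. So I spent my time looking at the memory-side configuration.
I read every Spark-related article I could find on how memory is handled in Spark. I came across similar questions on Stack Overflow, but none of them had exactly the same error.
This kind of bug has been reported in Scala 2.13.x, so I wondered whether my TaskServiceImpl was throwing this exception, but I am using 2.11.8.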
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
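I tried checking it in the Scala shell: scala> "a".mkString(",") gave me the correct output, so I was sure this was not causing the problem. I then checked the Hadoop version and updated it to the latest version, Hadoop 5.14.6, and it worked like a champ.
I was using Spark 2.11.8 and Scala 2.11.8, and my Hadoop version in CDH was 5.13.X.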