IBM Watson Studio Kernel Error In Spark and Python Environment (StackOverflowError on Spark environment, Watson Studio IBM Cloud)
I am following the Spark tutorial from the Watson Studio Gallery on IBM Cloud ( https://eu-de.dataplatform.cloud.ibm.com/exchange/public/entry/view/99b857815e69353c04d95daefb3b91fa?context=cpdaas ) and am running into a Java stack overflow:
Py4JJavaError: An error occurred while calling o20418.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.StackOverflowError
java.lang.StackOverflowError
at scala.collection.immutable.List$SerializationProxy.writeObject(List.scala:516)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1154)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
The line that fails:
cvModel = crossval.fit(trainingRatings)
The full cell:
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from pyspark.ml.evaluation import RegressionEvaluator
(trainingRatings, validationRatings) = ratings.randomSplit([0.8, 0.2])
evaluator = RegressionEvaluator(metricName='rmse', labelCol='rating', predictionCol='prediction')
paramGrid = ParamGridBuilder().addGrid(als.rank, [1, 5, 10]).addGrid(als.maxIter, [20]).addGrid(als.regParam, [0.05, 0.1, 0.5]).build()
crossval = CrossValidator(estimator=als, estimatorParamMaps=paramGrid, evaluator=evaluator, numFolds=10)
cvModel = crossval.fit(trainingRatings)
predictions = cvModel.transform(validationRatings)
print('The root mean squared error for our model is: {}'.format(evaluator.evaluate(predictions.na.drop())))
Environment used: Default Spark 3.2 & Python 3.9
Any help would be appreciated.
I solved this by adding more memory to the VM hosting the notebook.
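For anyone hitting the same trace: a java.lang.StackOverflowError during task serialization points at JVM thread stack depth rather than heap size, and it typically shows up when the lineage built by iterative fits (here, ALS runs inside a 10-fold CrossValidator over a 3x1x3 parameter grid) grows very deep. Besides giving the environment more memory, a commonly suggested workaround is raising the JVM thread stack size. A sketch, assuming your environment lets you override these Spark properties (the exact mechanism in Watson Studio may differ; the 16m value is an illustrative guess, not a tuned number):

```
spark.driver.extraJavaOptions    -Xss16m
spark.executor.extraJavaOptions  -Xss16m
```

Another frequently suggested mitigation is truncating the lineage with checkpointing (e.g. setting a checkpoint directory via `sc.setCheckpointDir(...)` together with the ALS `checkpointInterval` parameter). Both are workarounds rather than guaranteed fixes.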