
java.lang.OutOfMemoryError: Java heap space when collecting a lot of elements from an RDD in PySpark

I am trying to collect a large number of items from an RDD in PySpark, and I get the error java.lang.OutOfMemoryError: Java heap space. I think increasing the Java heap space will help.


I tried the command java -Xmx2g to increase the Java heap space, but it did not work.


Does anyone have any other ideas? Thank you!

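Running java -Xmx2g on its own does not change the heap size of the JVMs that Spark launches. Instead, you can control how much memory the Spark driver and executor processes can use by setting spark.driver.memory and spark.executor.memory. Because collecting an RDD brings every element back to the driver, spark.driver.memory is usually the setting that matters here.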

For example, you can run Spark like this:

./bin/spark-submit --name "My app" --master local[4] \
    --conf spark.driver.memory=2g \
    --conf spark.executor.memory=2g myApp.jar

You can configure these properties in a few different ways; see the documentation on Spark configuration.
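As one illustration of another way to pass these settings, the sketch below assembles a spark-submit command that uses the --driver-memory and --executor-memory options, which correspond to spark.driver.memory and spark.executor.memory. The helper build_spark_submit, the script name my_app.py, and the 4g value are hypothetical and chosen only for the example; adjust the path to spark-submit for your installation.

```python
import shlex


def build_spark_submit(app, driver_memory="2g", executor_memory="2g",
                       master="local[4]", name="My app"):
    """Return a spark-submit argument list (hypothetical helper)."""
    return [
        "./bin/spark-submit",                  # path is an assumption; adjust as needed
        "--name", name,
        "--master", master,
        "--driver-memory", driver_memory,      # corresponds to spark.driver.memory
        "--executor-memory", executor_memory,  # corresponds to spark.executor.memory
        app,
    ]


# my_app.py is a placeholder for your own PySpark script.
cmd = build_spark_submit("my_app.py", driver_memory="4g")

# Print a shell-safe version of the command (shlex.join needs Python 3.8+).
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) or pasted into a shell. Raising the driver memory only helps if the collected data actually fits on the driver machine.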

