I am using a configuration file, following the Configure Spark guide, to set up the EMR configuration on AWS. For example, changing spark.executor.extraClassPath is done via the following settings:
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.executor.extraClassPath": "/home/hadoop/mongo-hadoop-spark.jar"
  }
}
This works perfectly and does change spark.executor.extraClassPath in the EMR Spark conf, but EMR ships with some preset default paths in spark.executor.extraClassPath. Instead of overwriting them, I would like to know if there is a way to append my path while keeping the defaults, something like:
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.executor.extraClassPath": "{$extraClassPath}:/home/hadoop/mongo-hadoop-spark.jar"
  }
}
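As far as I know, EMR does not expand a placeholder like {$extraClassPath} in spark-defaults properties; the value is taken literally. One workaround (a sketch, not an official mechanism) is to read the current default value from /etc/spark/conf/spark-defaults.conf on a running cluster and restate it explicitly before your jar; the angle-bracket part below is a placeholder you would fill in yourself:

```json
{
  "Classification": "spark-defaults",
  "Properties": {
    "spark.executor.extraClassPath": "<default paths copied from spark-defaults.conf>:/home/hadoop/mongo-hadoop-spark.jar"
  }
}
```

The downside is that the copied defaults can drift when EMR releases change, so the spark.jars / --jars approaches below are usually less brittle.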
You can specify it in your EMR template as follows:

Classification: spark-defaults
ConfigurationProperties:
  spark.jars: <your jar location>
Specifying the full path for all additional jars at job submit time will also work for you:

--jars

This option will ship these jars to all the executors and will not change the default extra classpath.
One more option I know of, although I have only tried it with a YARN conf, not sure about EMR:
./bin/spark-submit --class "SparkTest" --master local[*] --jars /fullpath/first.jar,/fullpath/second.jar /fullpath/your-program.jar
You can also put "spark.jars" in spark-defaults.conf, so this configuration is used even if you submit from a notebook. Hope it solves your problem.
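A minimal sketch of what that spark-defaults.conf entry could look like, reusing the jar path from the question (the second jar is a hypothetical placeholder showing that the value is comma-separated):

```
# /etc/spark/conf/spark-defaults.conf
spark.jars  /home/hadoop/mongo-hadoop-spark.jar,/home/hadoop/another.jar
```

Unlike spark.executor.extraClassPath, the jars listed here are distributed to executors without touching the classpath defaults EMR has set.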