I have an AWS EMR cluster running PySpark applications (or steps, as they are called in AWS EMR).
I want to set environment variables for the PySpark applications, so after some googling I put this into the cluster configuration:
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executorEnv.MY_ENV": "some-value"
    }
  }
]
But the environment variable is not available in the PySpark process.
I also tried:
[
  {
    "Classification": "yarn-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "MY_ENV": "some-value"
        }
      }
    ]
  }
]
And then printed the environment variables from within the application via:
import os
print(os.environ)
MY_ENV does not show up in either case.
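For reference, the check runs from inside the step roughly like this (a minimal sketch of the step script, not the exact code; the app name is arbitrary, and it checks both the driver process and an executor process):

import os

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("env-check").getOrCreate()

# Value seen by the driver process
print("driver MY_ENV:", os.environ.get("MY_ENV"))

# Value seen by one executor process
def read_env(_):
    import os
    yield os.environ.get("MY_ENV")

print("executor MY_ENV:",
      spark.sparkContext.parallelize([0], numSlices=1).mapPartitions(read_env).collect())

spark.stop()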
How do I pass environment variables to my PySpark application?
Can you try to put this in spark-env instead:
[
  {
    "Classification": "spark-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "MY_ENV": "some-value"
        }
      }
    ]
  }
]
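For context, and as far as I understand the classifications involved: the spark-env export should be written to spark-env.sh on the cluster nodes, so a driver launched through spark-submit picks MY_ENV up in os.environ (in client deploy mode; for cluster deploy mode, Spark documents spark.yarn.appMasterEnv.MY_ENV in spark-defaults for driver-side variables). The spark.executorEnv.MY_ENV property from the first attempt only sets the variable for executor processes, which is why a driver-side print(os.environ) did not show it, and yarn-env exports apply to the YARN daemons rather than to the application containers. If the executors also need the variable, spark.executorEnv can be kept alongside this, for example set per application; a minimal sketch, not EMR-specific:

import os

from pyspark.sql import SparkSession

# Keep the executor-side variable as well, set when the session is built.
# MY_ENV / "some-value" mirror the values used in the classification above.
spark = (
    SparkSession.builder
    .appName("env-example")
    .config("spark.executorEnv.MY_ENV", "some-value")
    .getOrCreate()
)

# Driver-side value, expected to come from the spark-env export above
print(os.environ.get("MY_ENV"))

spark.stop()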