
aws emr can't change default pyspark python on bootstrap

I am using AWS EMR and trying to change the bootstrap script in order to set the default Python in pyspark to Python 3. I am following this tutorial.

This changes the /usr/lib/spark/conf/spark-env.sh file, but it does not change the Python version in pyspark; jobs still run with Python 2.7. It only works when I ssh to the machine and specifically run

$ source /usr/lib/spark/conf/spark-env.sh

When I try to add this line to the bootstrap script, I get a bootstrap error saying the file is not found:

/bin/bash: /usr/lib/spark/conf/spark-env.sh: No such file or directory

I assume the file does not exist at this stage. How can I set the pyspark Python to Python 3 in the bootstrap script?

Add the following configuration under software settings (Create EMR -> Step 1: Software and Steps -> Edit software settings -> Enter configuration):

[
  {
     "Classification": "spark-env",
     "Configurations": [
       {
         "Classification": "export",
         "Properties": {
            "PYSPARK_PYTHON": "/usr/bin/python3"
          }
       }
    ]
  }
]
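
The same JSON can also be supplied when launching the cluster from the AWS CLI with the --configurations option. A minimal sketch, assuming the configuration above is saved locally as spark-python3.json (the file name, cluster name, release label, and instance settings here are illustrative, not part of the original answer):

# Launch an EMR cluster with Spark and apply the spark-env configuration
# so that PYSPARK_PYTHON points to /usr/bin/python3 from the start.
aws emr create-cluster \
    --name "spark-python3-cluster" \
    --release-label emr-5.30.0 \
    --applications Name=Spark \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --configurations file://spark-python3.json

With this classification, EMR writes the exported PYSPARK_PYTHON value into /usr/lib/spark/conf/spark-env.sh when Spark is installed, so pyspark picks up Python 3 without editing the file from a bootstrap action (which runs before Spark exists on the node).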
