Amazon EMR pip install in bootstrap actions runs OK but has no effect

In Amazon EMR, I am using the following script as a custom bootstrap action to install Python packages. The script runs OK (I checked the logs, and the packages were installed successfully), but when I open a notebook in JupyterLab, I cannot import any of them. If I open a terminal in JupyterLab and run pip list or pip3 list, none of my packages are there. Even if I go to / and run find . -name mleap, for instance, nothing is found.

Something I have noticed is that on the master node I constantly get an error saying that bootstrap action 2 has failed (there is no second action, only one). According to this, it is a rare error, yet I get it on all my clusters. However, my cluster eventually gets created and I can use it.

My script is called aws-emr-bootstrap-actions.sh

#!/bin/bash

sudo python3 -m pip install numpy scikit-learn pandas mleap sagemaker boto3

I suspect it might have something to do with a Docker image being deployed that invalidates my previous installs, but from my Google searches I gather that it is common to use bootstrap actions to install Python packages, so this should work...

The PYSPARK Python interpreter (the one that Spark is using) is different from the one into which the OP was installing the modules (as confirmed in comments).
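One way to confirm this kind of mismatch is to ask the notebook kernel which interpreter it is running and whether it can see the packages. The snippet below is only a minimal sketch: the helper name interpreter_report is made up for illustration, and the package list simply mirrors the import names of the packages in the bootstrap script (scikit-learn is imported as sklearn).

```python
import importlib.util
import sys


def interpreter_report(packages):
    """Return the running interpreter's path, version, and unresolvable packages.

    `packages` holds top-level import names, not pip distribution names.
    (Hypothetical helper, not part of EMR or Jupyter.)
    """
    missing = [name for name in packages
               if importlib.util.find_spec(name) is None]
    return {
        "executable": sys.executable,          # interpreter running this code
        "version": sys.version.split()[0],     # e.g. "3.x.y"
        "missing": missing,                    # names this interpreter cannot import
    }


if __name__ == "__main__":
    # Import names for the packages installed by the bootstrap script.
    wanted = ["numpy", "sklearn", "pandas", "mleap", "sagemaker", "boto3"]
    print(interpreter_report(wanted))
```

If the reported executable is not the python3 that the bootstrap script invoked, the packages were probably installed into a different environment; running the install with the reported interpreter (its path followed by -m pip install ...) would be the thing to try.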

