Amazon EMR pip install in bootstrap actions runs OK but has no effect

In Amazon EMR, I am using the following script as a custom bootstrap action to install Python packages. The script runs OK (I checked the logs, and the packages were installed successfully), but when I open a notebook in JupyterLab, I cannot import any of them. If I open a terminal in JupyterLab and run pip list or pip3 list, none of my packages are listed. Even if I go to / and run find . -name mleap, for instance, nothing is found.

Something I have noticed is that on the master node I keep getting an error saying that bootstrap action 2 has failed (there is no second action, only one). According to this, it is a rare error, yet I get it in all my clusters. However, my cluster eventually gets created and I can use it.

My script is called aws-emr-bootstrap-actions.sh

#!/bin/bash

sudo python3 -m pip install numpy scikit-learn pandas mleap sagemaker boto3

I suspect it might have something to do with a Docker image being deployed that invalidates my previous installs, but judging from my Google searches, it is common to use bootstrap actions to install Python packages, so this should work...

The Python interpreter that PySpark is using is different from the one into which the OP was installing the modules (as confirmed in the comments).
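
One way to confirm such a mismatch is to ask the notebook's own interpreter where it lives and whether it can see the packages. The snippet below is only a minimal sketch: the helper name describe_interpreter is hypothetical, and the package names are taken from the bootstrap script above. PYSPARK_PYTHON is the standard Spark environment variable for selecting the Python executable; it may well be unset in a given kernel.

```python
import importlib.util
import os
import sys


def describe_interpreter(packages):
    """Report which Python is running and which top-level packages it can import.

    `packages` should contain top-level module names only (e.g. "numpy").
    """
    importable = {}
    for name in packages:
        # find_spec returns None when the module cannot be located
        # on this interpreter's import path.
        importable[name] = importlib.util.find_spec(name) is not None
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # its installation/virtualenv root
        "pyspark_python": os.environ.get("PYSPARK_PYTHON"),  # None if unset
        "packages": importable,
    }


if __name__ == "__main__":
    # Package names assumed from the bootstrap script in the question.
    report = describe_interpreter(["numpy", "pandas", "mleap"])
    print("interpreter:", report["executable"])
    print("prefix:", report["prefix"])
    print("PYSPARK_PYTHON:", report["pyspark_python"])
    for name, ok in report["packages"].items():
        print(f"{name}: {'importable' if ok else 'NOT importable here'}")
```

Running the same code in a notebook cell and with the python3 that the bootstrap script invokes should print different interpreter paths if the two environments differ; the packages then need to be installed with the interpreter the notebook reports.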


 