When I start a Jupyter notebook, it fails with a module import error:
import findspark
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-ff073c74b5db> in <module>
----> 1 import findspark
ModuleNotFoundError: No module named 'findspark'
conda list shows that the module is installed:
filelock 3.0.8 py37_0
findspark 1.3.0 py_1 conda-forge
flask 1.0.2 py37_1
Python version
(myenv) mm@mm-HP-EliteBook-8560p:~$ python -V
Python 3.6.8
It seems that my installation is not clean. These are the three Python-related lines from my .bash_profile:
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
Why do I get the import error?
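One way to narrow this down (a diagnostic sketch, not part of the original question): conda list reports findspark built for a py37 env while python -V prints 3.6.8, which suggests the notebook kernel and the conda env are different installations. Running the following in the failing notebook cell shows which interpreter and search paths the kernel actually uses:

```python
import sys

# Imports are resolved against this interpreter and these paths,
# not against whatever `conda list` prints in the terminal.
print(sys.executable)   # the Python binary running this kernel
print(sys.version)      # its version (compare against py37 vs 3.6.8)
print(sys.prefix)       # the environment it belongs to

# If findspark's site-packages directory is not among these,
# the import fails even though conda shows the package.
for p in sys.path:
    print(p)
```

If sys.executable points outside the conda env that holds findspark, that mismatch is the import error.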
I'd suggest a slightly different route. Suppose your Spark installation is at the
/Users/me/spark-2.4.0-bin-hadoop2.7
location. Assuming you're on a Mac, update your ~/.bash_profile
to contain these entries:
export SPARK_HOME=/Users/me/spark-2.4.0-bin-hadoop2.7
export PYTHONPATH=${SPARK_HOME}/python:$PYTHONPATH
export PYTHONPATH=${SPARK_HOME}/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export PYSPARK_PYTHON=<path to your python location>
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
PATH=$PATH:$SPARK_HOME/bin
Execute source ~/.bash_profile, then run:
pyspark
and it'll open the Jupyter notebook. Your notebook will now be tied to this Spark installation. If you're using Linux, I think the only change is in the syntax for appending entries to PATH, and instead of .bash_profile you probably need to change your .bashrc file.
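As a sketch of what those PYTHONPATH entries accomplish for the interpreter: Spark's Python bindings and its bundled py4j zip have to be importable. The path below is the example path from this answer, so adjust it to your installation; the snippet only demonstrates the sys.path effect and does not require Spark to be present.

```python
import os
import sys

# Example Spark location from the answer above (adjust to yours).
spark_home = "/Users/me/spark-2.4.0-bin-hadoop2.7"

# These two insertions mirror the two PYTHONPATH exports: first the
# Python bindings directory, then the bundled py4j source zip.
sys.path.insert(0, os.path.join(spark_home, "python"))
sys.path.insert(0, os.path.join(spark_home, "python", "lib", "py4j-0.10.7-src.zip"))

# The front of sys.path now contains both Spark entries.
print(sys.path[:2])
```

This is essentially what findspark.init() automates at runtime, which is why either approach makes pyspark importable.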
Make sure you are using the correct virtualenv.
Create a fresh virtualenv for your work (e.g. 3.7.4 is used as an example here; use a version you have installed):
pyenv virtualenv 3.7.4 myenv
You can see which python versions you have installed with:
pyenv versions
And which versions are available for installation with:
pyenv install -l
You can activate the virtualenv in your shell with:
pyenv shell myenv
With the virtualenv active, you should see the virtualenv name before your prompt. Something like "(myenv)~$: "
Now install all the Python packages as you normally would. Make sure you are in the right virtualenv before you install packages. You can also set the PYENV_VERSION environment variable to specify the virtualenv to use. Something like:
PYENV_VERSION=myenv python -m pip install findspark
Then
PYENV_VERSION=myenv python -m pip show findspark
Should give you something like:
Name: findspark
Version: 1.3.0
Summary: Find pyspark to make it importable.
Home-page: https://github.com/minrk/findspark
Author: Min RK
Author-email: benjaminrk@gmail.com
License: BSD (3-clause)
Location: /home/tzhuang/.pyenv/versions/3.7.4/envs/myenv/lib/python3.7/site-packages
Requires:
Required-by:
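A generic way to double-check the situation that pip show output describes (a sketch using only the standard library): the Location line pip reports must appear on the interpreter's sys.path, otherwise import findspark fails even though pip lists the package.

```python
import sys
import sysconfig

# "purelib" is the site-packages directory of the interpreter that
# is currently running; pip show's Location should match it when you
# query pip through the same interpreter (python -m pip show ...).
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)

# If this prints False, the running interpreter will not find
# packages installed into that directory.
print(site_packages in sys.path)
```

Running this under PYENV_VERSION=myenv should print the same Location as the pip show output above.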