I need to run a PySpark application (v1.6.3). There is the --py-files flag to add .zip, .egg, or .py files. If I had a Python package/module at /usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy, how would I include this whole module?
Inside this directory, I see some *.py and *.pyc files.
Would I have to include each of these one by one? For example:
spark-submit \
--py-files /usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy/fuzz.py,/usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy/process.py,/usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy/StringMatcher.py,/usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy/string_processing.py,/usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy/utils.py
Is there an easier way?
Any tips or pointers would be greatly appreciated. In reality, there are more Python modules managed by conda that I need.
I suggest going in the other direction: install pyspark into Anaconda with:
conda install -c conda-forge pyspark=2.1.1
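If installing pyspark into Anaconda is not an option, another approach that fits the --py-files flag from the question is to zip the whole package directory and pass the single .zip file. Below is a minimal sketch, assuming the package is pure Python (compiled extension modules generally cannot be imported from a zip). The function name zip_package and the /tmp output path are hypothetical, chosen only for illustration.

```python
import os
import shutil


def zip_package(package_dir, out_base):
    """Zip a package directory so the package folder sits at the archive's top level.

    package_dir: path to the package, e.g. .../site-packages/fuzzywuzzy
    out_base:    output path without the ".zip" suffix (hypothetical, pick your own)
    Returns the path of the archive that was written.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    name = os.path.basename(package_dir)
    # root_dir=parent with base_dir=name keeps entries as "<name>/...",
    # which is the layout "import <name>" needs when the zip is on the path.
    return shutil.make_archive(out_base, "zip", root_dir=parent, base_dir=name)


if __name__ == "__main__":
    # Path taken from the question; the output location is an assumption.
    archive = zip_package(
        "/usr/anaconda2/lib/python2.7/site-packages/fuzzywuzzy",
        "/tmp/fuzzywuzzy",
    )
    print(archive)
```

The resulting archive could then be passed as --py-files /tmp/fuzzywuzzy.zip in place of the long comma-separated list. For several conda-managed packages, the same function can be called once per package and the resulting .zip files joined with commas.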