Unable to import pyspark statistics module
Python 2.7, Apache Spark 2.1.0, Ubuntu 14.04. In the pyspark shell, I get the following error:
>>> from pyspark.mllib.stat import Statistics
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named stat
Similarly:
>>> from pyspark.mllib.linalg import SparseVector
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named linalg
I have numpy installed, and sys.path is:
>>> sys.path
['', u'/tmp/spark-2d5ea25c-e2e7-490a-b5be-815e320cdee0/userFiles-2f177853-e261-46f9-97e5-01ac8b7c4987', '/usr/local/lib/python2.7/dist-packages/setuptools-18.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pyspark-2.1.0+hadoop2.7-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/py4j-0.10.4-py2.7.egg', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python', '/home/d066537', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gst-0.10', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']
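Solution: uninstall the pip-installed pyspark. In the sys.path above, the pyspark-2.1.0+hadoop2.7-py2.7.egg under /usr/local/lib/python2.7/dist-packages comes before the Spark distribution's own python directory (/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python), so it is probably the copy that gets imported.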
sudo -H pip uninstall pyspark
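To see whether more than one copy of pyspark is on the import path, something like the following may help. It is only a rough sketch: find_package_locations is a hypothetical helper, not part of Spark or pip, and it only looks for a package directory (or a matching folder inside a zipped egg or archive) in each sys.path entry. The first location it reports is the one Python would normally import from.

```python
import os
import sys
import zipfile


def find_package_locations(package, search_path=None):
    """Return every entry of search_path that provides `package`, in import order.

    Hypothetical diagnostic helper; it does not import the package.
    """
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        entry = entry or os.getcwd()  # '' on sys.path means the current directory
        if os.path.isdir(os.path.join(entry, package)):
            # Plain directory or unzipped egg containing the package
            hits.append(entry)
        elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
            # Zipped egg or source archive on the path
            with zipfile.ZipFile(entry) as archive:
                if any(n.startswith(package + "/") for n in archive.namelist()):
                    hits.append(entry)
    return hits


if __name__ == "__main__":
    # More than one line of output means one copy shadows the others.
    for location in find_package_locations("pyspark"):
        print(location)
```

If this prints both a dist-packages egg and the Spark distribution's python directory, the egg is listed first in the sys.path from the question and would be the one imported. After uninstalling, running it again should show only the Spark directory.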
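I had the same problem. The Python file stat.py does not appear to be in Spark 2.1.x, but it is in Spark 2.2.x. So it seems you need to upgrade Spark, along with its newer pyspark (although Zeppelin 0.7.x does not seem to work with Spark 2.2.x).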
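Before upgrading, it may be worth checking what the installed pyspark.mllib actually contains. The snippet below is a small sketch using the standard pkgutil module. has_submodule is a hypothetical helper name, and the commented usage assumes that pyspark.mllib itself can be imported in your shell.

```python
import pkgutil


def has_submodule(package_dirs, submodule):
    """Return True if `submodule` (a module or sub-package) sits directly
    inside any of the given package directories."""
    names = [name for _, name, _ in pkgutil.iter_modules(package_dirs)]
    return submodule in names


# Usage in the pyspark shell (assumes `import pyspark.mllib` succeeds):
#   import pyspark.mllib
#   print(pyspark.mllib.__file__)                         # which copy was imported
#   print(has_submodule(pyspark.mllib.__path__, "stat"))
#   print(has_submodule(pyspark.mllib.__path__, "linalg"))
```

If both checks print False, the imported copy of pyspark is incomplete or is not the one you expected, and the printed __file__ shows where it lives.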