
Unable to import pyspark statistics module

Python 2.7, Apache Spark 2.1.0, Ubuntu 14.04. In the pyspark shell I get the following error:

>>> from pyspark.mllib.stat import Statistics
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named stat

Similarly:

>>> from pyspark.mllib.linalg import SparseVector
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named linalg

I have installed numpy, and sys.path shows:

>>> sys.path
['', u'/tmp/spark-2d5ea25c-e2e7-490a-b5be-815e320cdee0/userFiles-2f177853-e261-46f9-97e5-01ac8b7c4987', '/usr/local/lib/python2.7/dist-packages/setuptools-18.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pyspark-2.1.0+hadoop2.7-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/py4j-0.10.4-py2.7.egg', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python', '/home/d066537', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gst-0.10', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']

Solution

Remove the pip-installed pyspark, which appears ahead of Spark's own python directory in sys.path:

sudo -H pip uninstall pyspark
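
To check whether more than one location on sys.path could be supplying pyspark, something like the following sketch may help. It is only an illustration: the helper name find_pyspark_entries and its matching rules are assumptions made for this example and are not part of pyspark.

```python
import sys


def find_pyspark_entries(paths):
    """Return entries of `paths` that look like they could provide pyspark.

    Heuristic (an assumption for this sketch): an entry counts if its text
    contains "pyspark" (for example a pip-installed egg) or if it ends in
    "/python" (the python directory of a Spark distribution).
    """
    hits = []
    for entry in paths:
        normalized = entry.rstrip("/")
        if "pyspark" in normalized.lower() or normalized.endswith("/python"):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # List every candidate location, in the order Python searches them.
    for entry in find_pyspark_entries(sys.path):
        print(entry)
    try:
        import pyspark
        # __file__ shows which copy actually won the import.
        print("pyspark imported from: %s" % getattr(pyspark, "__file__", None))
    except ImportError:
        print("pyspark is not importable from this interpreter")
```

Applied to the sys.path shown in the question, this heuristic would flag both the dist-packages egg and the python directory of the Spark distribution. After the uninstall, only the latter should remain.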

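I had the same problem. The Python file stat.py appears to be missing from Spark 2.1.x but present in Spark 2.2.x, so it seems you need to upgrade Spark together with its newer pyspark. Note, however, that Zeppelin 0.7.x does not seem to work with Spark 2.2.x.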

