
Can't start pyspark (DSE 4.6)

I've installed DataStax Enterprise 4.6 on a cluster and I can't figure out why pyspark throws this error. The Scala interface works fine, but the Python one doesn't. Does anyone have a clue how to fix this?

Python 2.6.6, CentOS 6.5

Cheers

bash-4.1$ dse pyspark --master spark://IP:7077
Python 2.6.6 (r266:84292, Jan 22 2014, 01:49:05)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/usr/share/dse/spark/python/pyspark/shell.py", line 33, in <module>
    import pyspark
  File "/usr/share/dse/spark/python/pyspark/__init__.py", line 63, in <module>
    from pyspark.context import SparkContext
  File "/usr/share/dse/spark/python/pyspark/context.py", line 34, in <module>
    from pyspark import rdd
  File "/usr/share/dse/spark/python/pyspark/rdd.py", line 1972
    return {convertColumnValue(v) for v in columnValue}
                                    ^
SyntaxError: invalid syntax
>>>

The PySpark support included in DSE 4.6 requires Python 2.7.x and throws the error you're seeing when run on Python 2.6.x. An upcoming patch release should add Python 2.6.x support, but there is no specific date for it yet.
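For context, the failing line in `rdd.py` uses a set comprehension (`{... for ... in ...}`), syntax that was only added in Python 2.7, which is why the 2.6 parser raises a `SyntaxError` at the `for`. A minimal sketch of the incompatible construct and its 2.6-compatible equivalent (the values and names here are illustrative, not taken from DSE):

```python
values = [1, 2, 2, 3]

# Python 2.7+ / 3.x syntax (the form used in pyspark/rdd.py):
# on Python 2.6 this line alone is a SyntaxError.
result_27 = {v * 2 for v in values}

# Python 2.6-compatible equivalent: pass a generator expression to set().
result_26 = set(v * 2 for v in values)

assert result_27 == result_26  # both are the set {2, 4, 6}
```

So the shell never even gets to run: the module fails at import time, before any Spark code executes.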
