
Error while running my first PySpark project

I am trying to run the following PySpark code:

data = ["Project","Gutenberg’s","Alice’s","Adventures",
"in","Wonderland","Project","Gutenberg’s","Adventures",
"in","Wonderland","Project","Gutenberg’s"]
rdd=spark.sparkContext.parallelize(data)
rdd2=rdd.map(lambda x: (x,1))
for element in rdd2.collect():
    print(element)

It fails with the error below. How can I fix it?

Thanks.

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/Users/johanvu/opt/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/pyspark.zip/pyspark/worker.py", line 267, in main
    ("%d.%d" % sys.version_info[:2], version))
Exception: Python in worker has different version 3.6 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.

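Well, I don't think there is anything wrong with your code; it runs fine on the version shown here. The traceback points to the environment instead: the worker is running Python 3.6 while the driver is running Python 3.7.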

<code>

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.0.319-3
      /_/

Using Python version 2.7.5 (default, Aug 13 2020 02:51:10)
SparkSession available as 'spark'.


>>> data = ["Project","Gutenberg’s","Alice’s","Adventures",
... "in","Wonderland","Project","Gutenberg’s","Adventures",
... "in","Wonderland","Project","Gutenberg’s"]
>>> rdd=spark.sparkContext.parallelize(data)
>>> rdd2=rdd.map(lambda x: (x,1))
>>> for element in rdd2.collect():
...     print(element)
...
('Project', 1)
('Gutenberg\xe2\x80\x99s', 1)
('Alice\xe2\x80\x99s', 1)
('Adventures', 1)
('in', 1)
('Wonderland', 1)
('Project', 1)
('Gutenberg\xe2\x80\x99s', 1)
('Adventures', 1)
('in', 1)
('Wonderland', 1)
('Project', 1)
('Gutenberg\xe2\x80\x99s', 1)
>>>
<code>
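
The error message itself suggests checking `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON`. A minimal sketch of one way to do that from inside a script or notebook follows, assuming it is acceptable for the workers to use the same interpreter that runs the driver (`sys.executable`). The helper names `pin_pyspark_python` and `build_session`, the app name, and the `local[*]` master are illustrative choices, not part of the original post.

```python
import os
import sys


def pin_pyspark_python(python_path=sys.executable):
    """Point the driver and the workers at the same Python interpreter.

    Assumption: this runs before any SparkSession/SparkContext is created,
    because the variables are read when the context starts.
    """
    os.environ["PYSPARK_PYTHON"] = python_path
    os.environ["PYSPARK_DRIVER_PYTHON"] = python_path
    return python_path


def build_session():
    """Create a local session after pinning the interpreter (hypothetical helper)."""
    # Imported lazily so pin_pyspark_python() can be used without PySpark installed.
    from pyspark.sql import SparkSession

    pin_pyspark_python()
    return (
        SparkSession.builder
        .master("local[*]")
        .appName("first-project")
        .getOrCreate()
    )
```

Calling `spark = build_session()` and then re-running the snippet from the question should make the workers start with the driver's interpreter. If a session is already running (for example in the `pyspark` shell), it most likely has to be stopped and recreated, or the two variables exported in the shell before launching, for the change to take effect.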

