Jupyter issue with python / pyspark versions

I am running a Jupyter notebook using the pyspark kernel. I am getting the following error. How can I force Jupyter (ideally from within Jupyter) to use the right driver?

Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions

Thank you

Hani
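For reference, the mismatch can be confirmed from inside the notebook itself. A minimal diagnostic sketch, assuming the pyspark kernel already provides a SparkContext named sc (as the shell startup script does):

 import sys

 # Driver side: the interpreter running this notebook cell
 print(sys.version)

 # Worker side: ship a tiny job and report an executor's interpreter
 print(sc.range(1).map(lambda _: __import__("sys").version).collect())

If the two printed versions differ in their minor version (here 2.6 vs 2.7), PySpark will refuse to run.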

It could be a problem in your pyspark kernel.json configuration. For example, my pyspark kernel is located at:

/usr/local/share/jupyter/kernels/pyspark/kernel.json

and contains:

{
 "display_name": "pySpark (Spark 1.6.0)",
 "language": "python",
 "argv": [
  "/usr/local/bin/python2.7",
  "-m",
  "ipykernel",
  "-f",
  "{connection_file}"
 ],
 "env": {
  "PYSPARK_PYTHON": "/usr/local/bin/python2.7",
  "SPARK_HOME": "/usr/lib/spark",
  "PYTHONPATH": "/usr/lib/spark/python/lib/py4j-0.9-src.zip:/usr/lib/spark/python/",
  "PYTHONSTARTUP": "/usr/lib/spark/python/pyspark/shell.py",
  "PYSPARK_SUBMIT_ARGS": "--master yarn-client pyspark-shell"
 }
}

It is very important to point to the same Python version in both places (argv and PYSPARK_PYTHON).
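If you cannot edit kernel.json, the same effect can often be achieved from within Jupyter by exporting both variables before the first SparkContext is created. A hedged sketch; the interpreter path is an assumption, so use whatever Python 2.7 path exists on both the driver and the workers:

 import os

 # Must run before any SparkContext exists; both variables are assumed
 # to point at an interpreter installed at the same path cluster-wide.
 os.environ["PYSPARK_PYTHON"] = "/usr/local/bin/python2.7"
 os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/local/bin/python2.7"

 from pyspark import SparkConf, SparkContext
 sc = SparkContext(conf=SparkConf().setAppName("version-check"))

Note that this does not help if the kernel (as above) has already started a SparkContext via shell.py; in that case the kernel.json fix is the reliable one.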

Hope that helps!
