Set findspark.init() Permanently
I have Apache Spark installed on Ubuntu at /home/mymachine/spark-2.1.0-bin-hadoop2.7, so I have to go to the python directory under that path to be able to use Spark. Alternatively, I can use it from outside that directory with the help of a library called findspark, but it seems I always have to initialize the library like this:
import findspark
findspark.init("/home/mymachine/spark-2.1.0-bin-hadoop2.7")
every time I want to use findspark, which is not very efficient. Is there any way to initialize this library permanently?
Here it was mentioned that you need to set a variable SPARK_HOME in .bash_profile, and I did, but no luck.
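A quick way to see whether that export ever reached Python is to check the environment from the interpreter itself. A minimal diagnostic sketch, assuming only the standard library; if it prints None, the process that launched Python never sourced .bash_profile (a common cause when Jupyter is started from a desktop launcher rather than a login shell):

import os

# Prints None if the SPARK_HOME export never reached this Python process,
# in which case findspark has nothing to fall back on.
print(os.environ.get("SPARK_HOME"))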
Add the following variables to your .bashrc file:
export SPARK_HOME=/path/2/spark/folder
export PATH=$SPARK_HOME/bin:$PATH
then source .bashrc
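With SPARK_HOME exported this way, findspark can locate Spark on its own, so the hard-coded path becomes unnecessary. A minimal sketch, assuming the variable is visible to the Python process:

import findspark

# No path argument needed: findspark falls back to the SPARK_HOME
# environment variable when none is given.
findspark.init()

import pyspark  # now importable from any directory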
If you wish to run pyspark with jupyter notebook, add these variables to .bashrc:
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
again source .bashrc
Now if you run pyspark from the shell, it will launch a jupyter notebook server, and pyspark will be available in the python kernels.
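Alternatively, if you prefer starting the notebook server the usual way (plain jupyter notebook rather than the pyspark launcher), a short findspark cell at the top of the notebook achieves the same thing. A sketch, assuming SPARK_HOME is set as above; the master and appName values here are arbitrary illustrative choices:

import findspark
findspark.init()  # resolves Spark via SPARK_HOME

from pyspark import SparkContext

sc = SparkContext(master="local[*]", appName="findspark-demo")
print(sc.parallelize(range(100)).sum())  # 4950
sc.stop()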