I would like to install PySpark 2.4.4. I have seen that I can either download the Spark package or use pip install. I only need PySpark; do both installations give me the same thing?
You could run pip install pyspark==2.4.4,
but that package does not come with the Hadoop binaries, which Spark needs to function properly.
The easiest way to install it is with the Python package findspark:
Download the .tgz file from the Spark website (it comes with the Hadoop binaries) and extract it.
pip install findspark
In Python:
import findspark
# Raw string, so the backslashes in the Windows path are not read as escape sequences
findspark.init(r'\path\to\extracted\binaries\folder')
import pyspark
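If findspark.init fails, the usual cause is a path that points at the wrong folder. Below is a minimal sketch of a sanity check you could run first. The helper names looks_like_spark_home and init_spark are made up for illustration and are not part of findspark. The check assumes the extracted folder contains bin/spark-submit and python/pyspark, which is how the Spark 2.4.x downloads are laid out.

```python
from pathlib import Path


def looks_like_spark_home(path):
    """Hypothetical helper: True if `path` has the layout of an extracted Spark download."""
    root = Path(path)
    # Assumption: an extracted Spark .tgz contains bin/spark-submit and python/pyspark.
    has_launcher = (root / "bin" / "spark-submit").is_file()
    has_pyspark = (root / "python" / "pyspark").is_dir()
    return has_launcher and has_pyspark


def init_spark(path):
    """Hypothetical wrapper: validate the folder, then pass it to findspark.init."""
    if not looks_like_spark_home(path):
        raise FileNotFoundError(f"{path!r} does not look like an extracted Spark folder")
    # Imported here so the layout check above can run even without findspark installed.
    import findspark
    findspark.init(str(path))
    import pyspark
    return pyspark.__version__
```

Calling init_spark with the same folder you would pass to findspark.init should return the PySpark version string, or raise FileNotFoundError early if the path is wrong.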