
NameError: name 'spark' is not defined, how to solve?

I have just installed pyspark 2.4.5 on my Ubuntu 18.04 laptop, and when I run the following code,

# This is only part of the code.
import pubmed_parser as pp
from pyspark.sql import SparkSession
from pyspark.sql import Row

medline_files_rdd = spark.sparkContext.parallelize(glob('/mnt/hgfs/ShareDir/data/*.gz'), numSlices=1000)
parse_results_rdd = medline_files_rdd.\
    flatMap(lambda x: [Row(file_name=os.path.basename(x), **publication_dict)
                       for publication_dict in pp.parse_medline_xml(x)])

medline_df = parse_results_rdd.toDF()
# save to parquet
medline_df.write.parquet('raw_medline.parquet', mode='overwrite')


medline_df = spark.read.parquet('raw_medline.parquet')

I get this error:

medline_files_rdd = spark.sparkContext.parallelize(glob('/mnt/hgfs/ShareDir/data/*.gz'), numSlices=1000)
NameError: name 'spark' is not defined

I have seen similar questions on StackOverflow, but none of them solved my problem. Can anyone help me? Thanks a lot.

By the way, I am new to Spark. If I just want to use Spark from Python, is it enough to install pyspark with pip install pyspark? Is there anything else I should do? Should I install Hadoop or anything else?

Just create a Spark session at the start. The pyspark shell predefines spark for you, but a standalone script has to create it itself:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('abc').getOrCreate()
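To show where that line sits relative to the code in the question, here is a minimal sketch of the whole script with the session created before its first use. The helper names build_spark, list_gz_files and main, and the app name 'medline_parser', are made up for illustration; the paths and the pubmed_parser call are taken from the question. It assumes pyspark and pubmed_parser are installed and has not been run against the asker's data.

```python
import glob
import os


def build_spark(app_name="medline_parser"):
    """Create (or reuse) the SparkSession that the script refers to as `spark`."""
    # Imported inside the function so the helpers below can be used
    # even on a machine where pyspark is not installed.
    from pyspark.sql import SparkSession
    return SparkSession.builder.appName(app_name).getOrCreate()


def list_gz_files(data_dir):
    """Return the .gz files directly inside data_dir, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(data_dir, "*.gz")))


def main(data_dir="/mnt/hgfs/ShareDir/data", out_path="raw_medline.parquet"):
    from pyspark.sql import Row
    import pubmed_parser as pp

    # Bind the name `spark` BEFORE anything uses it; this is the missing step.
    spark = build_spark()

    medline_files_rdd = spark.sparkContext.parallelize(
        list_gz_files(data_dir), numSlices=1000
    )
    parse_results_rdd = medline_files_rdd.flatMap(
        lambda x: [Row(file_name=os.path.basename(x), **publication_dict)
                   for publication_dict in pp.parse_medline_xml(x)]
    )

    # Save to parquet, then read it back through the same session.
    parse_results_rdd.toDF().write.parquet(out_path, mode="overwrite")
    medline_df = spark.read.parquet(out_path)
    medline_df.printSchema()

    spark.stop()
```

Calling main() at the bottom of the script runs the pipeline. The question's snippet also uses glob and os without showing their imports, so the sketch imports both explicitly.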

