PySpark writing data into Hive

Below is my code to write data into Hive:

from pyspark import SparkContext as sc
from pyspark.sql import SparkSession, SQLContext
from pyspark.sql import HiveContext as hc
from pyspark.sql.functions import isnan
from pyspark.sql.types import *


spark = SparkSession.builder \
    .appName("example-spark") \
    .config("spark.sql.crossJoin.enabled", "true") \
    .config("spark.sql.warehouse.dir", "file:///C:/spark-2.0.0-bin-hadoop2.7/bin/metastore_db/spark-warehouse") \
    .config("spark.rpc.message.maxSize", "1536") \
    .getOrCreate()

join_df = spark.read.csv("file:///D:/valid.csv", header="true", inferSchema=True, sep=",")

join_df=join_df.where("LastName != ''").show()  
join_df.registerTempTable("test")
hc.sql("CREATE TABLE dev_party_tgt_repl STORED AS PARQUETFILE AS SELECT * from dev_party_tgt")

After executing the above code I get the error below:

Traceback (most recent call last):
  File "D:\01 Delivery Support\01 easyJet\SparkEclipseWorkspace\SparkTestPrograms\src\NameValidation.py", line 22, in <module>
    join_df.registerTempTable("test")
AttributeError: 'NoneType' object has no attribute 'test'

My System Environment details:

  • OS: Windows
  • Eclipse Neon
  • Spark Version: 2.0.0

Try this:

join_df.where("LastName != ''").write.saveAsTable("dev_party_tgt_repl")
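The root cause of the traceback is that `DataFrame.show()` is an action: it prints rows to the console and returns `None`, so `join_df = join_df.where(...).show()` rebinds `join_df` to `None`, and any later method call on it raises `AttributeError`. The pitfall can be sketched with a toy stand-in class (`ToyFrame` is illustrative only, not a real pyspark class, so no Spark installation is needed to run it):

```python
class ToyFrame:
    """Toy stand-in for a Spark DataFrame, for illustration only."""

    def where(self, condition):
        # Transformations return a new frame, so calls can be chained.
        return ToyFrame()

    def show(self):
        # Actions such as show() print output and return None.
        print("+--------+")


df = ToyFrame()

broken = df.where("LastName != ''").show()  # broken is None
kept = df.where("LastName != ''")           # kept is still a ToyFrame

print(broken is None)              # True: later method calls on broken fail
print(isinstance(kept, ToyFrame))  # True: safe to keep chaining
```

With a real DataFrame the same logic applies: keep the result of `.where(...)` in the variable, and call `.show()` separately if you want to inspect the rows, so that `registerTempTable` or `write.saveAsTable` still operate on a DataFrame.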
