
Insert NULL values into Hive with a DataFrame in Spark

I'm trying to insert values into a Hive table. If every column has a value there is no problem, but I need to insert a NULL value into one of the columns. This is how I'm doing it:

val errorsToAlert = List(("source1", "table1", "27-01-2002", null))
val data = sqlContext.createDataFrame(errorsToAlert)
  .toDF("source", "table_name", "open_date", "close_date")
data.write.mode("append").saveAsTable("management.alerts")

I've tried both null and None, but each produces this error:

17/06/26 11:59:38 ERROR yarn.ApplicationMaster: User class threw exception:
scala.MatchError: scala.None.type (of class scala.reflect.internal.Types$UniqueSingleType)
scala.MatchError: scala.None.type (of class scala.reflect.internal.Types$UniqueSingleType)

The problem is completely unrelated to Hive. If you check the type of errorsToAlert, you'll see it is:

List[(String, String, String, Null)]

and scala.Null is not an acceptable input for a Dataset.
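The inferred type can be checked without Spark at all. The following is a minimal sketch, assuming Scala 2 with scala-reflect on the classpath; the object name InferredTypeDemo and the helper staticTypeOf are hypothetical names introduced only for illustration.

```scala
import scala.reflect.runtime.universe._

object InferredTypeDemo {
  // Hypothetical helper: returns the static type the compiler inferred for `value`.
  def staticTypeOf[T: TypeTag](value: T): Type = typeOf[T]

  def main(args: Array[String]): Unit = {
    // A bare null gives the compiler nothing to widen to, so the fourth element is typed as Null.
    val untyped = List(("source1", "table1", "27-01-2002", null))
    // A type ascription on null makes the fourth element a String.
    val typed = List(("source1", "table1", "27-01-2002", null: String))

    println(staticTypeOf(untyped))
    println(staticTypeOf(typed))
  }
}
```

The first printed type should show Null as the last tuple element and the second should show String, which is the difference Spark's schema inference cares about.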

If the required type is itself nullable, you can specify it explicitly:

sqlContext.createDataFrame(Seq(
  ("source1", "table1","27-01-2002", null: String)
))

Otherwise, use scala.Option:

sqlContext.createDataFrame(Seq(
  ("source1", "table1","27-01-2002", None: Option[Int])
))
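Applied to the code from the question, the Option approach might look like the sketch below. It assumes Spark 2.x or later (so it uses SparkSession rather than the question's sqlContext), a local master for demonstration, and that close_date is a string column; the object name NullableAlertsSketch and the function buildAlerts are hypothetical. The Hive write is left commented out because it needs Hive support and an existing management.alerts table.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NullableAlertsSketch {
  // Builds the single-row DataFrame from the question with a missing close_date.
  // Assumption: close_date is stored as a string, so Option[String] is used.
  def buildAlerts(spark: SparkSession): DataFrame = {
    val errorsToAlert = Seq(
      ("source1", "table1", "27-01-2002", None: Option[String])
    )
    spark.createDataFrame(errorsToAlert)
      .toDF("source", "table_name", "open_date", "close_date")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode for illustration; on a cluster the master comes from spark-submit.
    val spark = SparkSession.builder()
      .appName("nullable-alerts-sketch")
      .master("local[1]")
      .getOrCreate()

    val data = buildAlerts(spark)
    data.printSchema() // close_date should be reported as a nullable string column
    data.show()

    // With Hive support enabled, the write from the question would follow:
    // data.write.mode("append").saveAsTable("management.alerts")

    spark.stop()
  }
}
```

Using `null: String` in place of `None: Option[String]` should produce the same schema here; Option is mainly needed for types such as Int that cannot hold null.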

