How to add a new nullable String column in a DataFrame using Scala
There are probably at least 10 questions very similar to this, but I still have not found a clear answer.
How can I add a nullable string column to a DataFrame using Scala? I was able to add a column with null values, but its DataType shows as null:
val testDF = myDF.withColumn("newcolumn", when(col("UID") =!= "not", null).otherwise(null))
However, the schema shows:
root
|-- UID: string (nullable = true)
|-- IsPartnerInd: string (nullable = true)
|-- newcolumn: null (nullable = true)
I want the new column to be a string:
|-- newcolumn: string (nullable = true)
Please don't mark this as a duplicate unless it's really the same question and in Scala.
Just explicitly cast the null literal to StringType.
scala> val testDF = myDF.withColumn("newcolumn", when(col("UID") =!= "not", lit(null).cast(StringType)).otherwise(lit(null).cast(StringType)))
scala> testDF.printSchema
root
|-- UID: string (nullable = true)
|-- newcolumn: string (nullable = true)
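The snippets above rely on imports that are not shown. As a minimal, self-contained sketch of the same cast (the object name TypedNullColumn, the helper withNullString, the local SparkSession, and the sample rows standing in for myDF are all assumptions made for illustration), the difference between an untyped and a typed null column can be checked directly on the schema:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.{NullType, StringType}

object TypedNullColumn {
  // Hypothetical helper: adds an all-null column whose type is string.
  def withNullString(df: DataFrame, name: String): DataFrame =
    df.withColumn(name, lit(null).cast(StringType))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-null-column")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for myDF, using the two columns from the question's schema.
    val myDF = Seq(("u1", "Y"), ("not", "N")).toDF("UID", "IsPartnerInd")

    // A bare null literal carries no type information, so the column is NullType.
    val untyped = myDF.withColumn("newcolumn", lit(null))
    assert(untyped.schema("newcolumn").dataType == NullType)

    // Casting the literal gives the column a concrete string type.
    val typed = withNullString(myDF, "newcolumn")
    assert(typed.schema("newcolumn").dataType == StringType)

    typed.printSchema()
    spark.stop()
  }
}
```

Since both branches of the when/otherwise expression in the answer yield the same typed null, the sketch drops the conditional and uses the cast literal on its own.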
Why do you want a column that is always null? There are several ways; I would prefer the solution with typedLit:
myDF.withColumn("newcolumn", typedLit[String](null))
or, for older Spark versions:
myDF.withColumn("newcolumn",lit(null).cast(StringType))
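To compare the two variants side by side, here is a small sketch that builds the column both ways and checks that the resulting schemas match. The object name TypedLitNull, both helper names, the local SparkSession, and the sample data are assumptions for illustration. typedLit was added to org.apache.spark.sql.functions in Spark 2.2, which is why the cast form is the fallback on earlier versions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{lit, typedLit}
import org.apache.spark.sql.types.StringType

object TypedLitNull {
  // Hypothetical helper: typed null literal via typedLit (Spark 2.2 or later).
  def viaTypedLit(df: DataFrame): DataFrame =
    df.withColumn("newcolumn", typedLit[String](null))

  // Hypothetical helper: the same column via an explicit cast.
  def viaCast(df: DataFrame): DataFrame =
    df.withColumn("newcolumn", lit(null).cast(StringType))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-lit-null")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for myDF with a single UID column.
    val myDF = Seq("u1", "not").toDF("UID")

    val a = viaTypedLit(myDF)
    val b = viaCast(myDF)

    // Both approaches should yield a nullable string column.
    assert(a.schema("newcolumn").dataType == StringType)
    assert(a.schema == b.schema)

    a.printSchema()
    spark.stop()
  }
}
```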