Add a new column with a literal value to a struct column in a DataFrame in Spark Scala

I have a DataFrame with the following schema:

root
 |-- docnumber: string (nullable = true)
 |-- event: struct (nullable = false)
 |    |-- data: struct (nullable = true)
 |    |    |-- codevent: int (nullable = true)

I need to add a field inside event.data so that the schema looks like this:

root
 |-- docnumber: string (nullable = true)
 |-- event: struct (nullable = false)
 |    |-- data: struct (nullable = true)
 |    |    |-- codevent: int (nullable = true)
 |    |    |-- needtoaddit: int (nullable = true)

I tried:

  • dataframe.withColumn("event.data.needtoaddit", lit("added"))

    but it adds a top-level column literally named event.data.needtoaddit

  • dataframe.withColumn( "event", struct( $"event.*", struct( lit("added").as("needtoaddit") ).as("data") ) )

    but it creates a second, ambiguous field named event.data, so the problem remains.

How can I make it work?

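You're close. Rebuild the inner struct from its existing fields plus the new one, then wrap it back into event (this assumes data is the only field of event, since any sibling fields would be dropped):

import org.apache.spark.sql.functions.{lit, struct}
import spark.implicits._  // enables the $"..." syntax; assumes a SparkSession named spark

val df2 = df.withColumn(
    "event",
    // Rebuild event with data as its only field
    struct(
        // Expand the existing fields of event.data, then append the new one
        struct(
            $"event.data.*",
            lit("added").as("needtoaddit")
        ).as("data")
    )
)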
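
If event has fields other than data, the rebuild above would lose them. The following is a minimal sketch of one way to keep them, written for spark-shell or a Scala script. The helper name addFieldToEventData is hypothetical, and the column names event and data are taken from the question. It reads the sibling field names from the schema and re-adds them next to the rebuilt data struct.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, lit, struct}
import org.apache.spark.sql.types.StructType

// Hypothetical helper: appends `value` as field `name` inside event.data
// while keeping every other field of event.
def addFieldToEventData(df: DataFrame, name: String, value: Column): DataFrame = {
  // Look up the current fields of the event struct from the schema
  val eventType = df.schema("event").dataType.asInstanceOf[StructType]

  // Every field of event except data is carried over unchanged
  val siblings: Array[Column] =
    eventType.fieldNames.filter(_ != "data").map(f => col(s"event.$f"))

  // Rebuild data from its existing fields plus the new one
  val newData: Column = struct(col("event.data.*"), value.as(name)).as("data")

  // Note: data ends up as the last field of event
  df.withColumn("event", struct((siblings :+ newData): _*))
}

// Usage (assuming df has the schema from the question):
// val df2 = addFieldToEventData(df, "needtoaddit", lit("added"))
```

One trade-off of this sketch is field order: data moves to the end of event. If the order matters, the struct would have to be rebuilt by mapping over eventType.fieldNames in place instead.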

Spark 3.1+

To add a field inside a struct column, use withField. A dotted field name reaches into nested structs, so the call can be made on the outer struct:

col("event").withField("data.needtoaddit", lit("added"))

Input:

import org.apache.spark.sql.functions.{col, lit, struct}

val df = spark.createDataFrame(Seq(("1", 2)))
    .select(
        col("_1").as("docnumber"),
        struct(struct(col("_2").as("codevent")).as("data")).as("event")
    )
df.printSchema()
// root
//  |-- docnumber: string (nullable = true)
//  |-- event: struct (nullable = false)
//  |    |-- data: struct (nullable = false)
//  |    |    |-- codevent: integer (nullable = false)

Script:

val df2 = df.withColumn(
    "event",
    // Update event in place; the dotted name targets the nested data struct.
    // lit("added") is a string literal, so the new field is a string.
    col("event").withField("data.needtoaddit", lit("added"))
)

df2.printSchema()
// root
//  |-- docnumber: string (nullable = true)
//  |-- event: struct (nullable = false)
//  |    |-- data: struct (nullable = false)
//  |    |    |-- codevent: integer (nullable = false)
//  |    |    |-- needtoaddit: string (nullable = false)
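
The target schema in the question lists needtoaddit as a nullable int, whereas lit("added") is a string literal. As a minimal sketch, assuming a typed null placeholder is acceptable for the new field, the literal can be cast to int before it is added. The SparkSession settings below (local master, app name) are assumptions made only to keep the example self-contained in spark-shell or a Scala script.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, struct}

// Assumed local session; in spark-shell this returns the existing one
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("withField-sketch")
  .getOrCreate()

// Same input shape as in the question: event.data.codevent
val df = spark.createDataFrame(Seq(("1", 2)))
  .select(
    col("_1").as("docnumber"),
    struct(struct(col("_2").as("codevent")).as("data")).as("event")
  )

// A null literal cast to int yields a nullable int field inside event.data
val df2 = df.withColumn(
  "event",
  col("event").withField("data.needtoaddit", lit(null).cast("int"))
)

df2.printSchema()
```

To store an actual number instead of a placeholder, an integer literal such as lit(1) can be passed in place of the cast null.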
