How to update struct field spark/scala

I have a struct as part of my JSON.

store: struct (c1, c2, c3, c4)

I would like to update c2 in place so that no new field is created. After the update, it should be the same struct with a new value for c2.

In spark/scala, I have already tried:

df.withColumn("store.c2", newVal)

But this creates a new top-level column named store.c2 instead of changing the field inside the struct. For columns that are not part of a struct, the update works:

df.withColumn("columnTen", newValue)

does not create a new field and sets columnTen to newValue.

You can create a new StructType based on an existing schema by using the following:

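import org.apache.spark.sql.types.{StructField, StructType, TimestampType}
// Copy every field of the existing schema, swapping out only the one to change
val newstructtype = StructType(existing_df.schema.map { x =>
  if (x.name == "fieldname_to_change") StructField("new_fieldname", TimestampType, true)
  else x
})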
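
As a rough illustration of the schema-mapping idea above, here is a minimal sketch that runs on a hand-built StructType, so no SparkSession is needed. The field names `id` and `fieldname_to_change` and their types are made up for the example. Note that this only produces a new schema object; it does not by itself change the data in an existing DataFrame.

```scala
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType, TimestampType}

// Hypothetical starting schema, standing in for existing_df.schema
val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("fieldname_to_change", StringType, nullable = true)
))

// Keep every field as-is except the one whose name matches
val newSchema = StructType(schema.map { f =>
  if (f.name == "fieldname_to_change") StructField("new_fieldname", TimestampType, nullable = true)
  else f
})

// The untouched field keeps its position, name, and type
println(newSchema.treeString)
```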

Doing this:

df.withColumn("store",struct($"store.c1", $"store.c2", $"store.c3", lit(newValue) as "c4"))

will replace the value in the store.c4 field.
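
The snippet above targets c4, while the question asks about c2; the same pattern should apply with the replacement moved to the c2 position. A minimal sketch, assuming a local SparkSession and made-up sample data (the string values and the app name are purely illustrative):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, struct}
import org.apache.spark.sql.types.StructType

// Local session for the sketch; in spark-shell the existing session is reused
val spark = SparkSession.builder().master("local[1]").appName("rebuild-struct").getOrCreate()
import spark.implicits._

// Hypothetical data: one row with a struct column "store" holding c1..c4
val df = Seq(("a", "b", "c", "d")).toDF("c1", "c2", "c3", "c4")
  .select(struct($"c1", $"c2", $"c3", $"c4").as("store"))

// Rebuild the struct field by field, substituting a literal for c2.
// Reusing the name "store" replaces the existing column rather than adding one.
val updated = df.withColumn("store", struct(
  $"store.c1",
  lit("new").as("c2"),
  $"store.c3",
  $"store.c4"
))

updated.printSchema()
```

The drawback of this approach is that every field of the struct has to be listed by hand, so it gets unwieldy for wide structs.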

Since Spark 3.1, you can use withField on a struct column:

An expression that adds/replaces field in StructType by name.

df.withColumn("store", $"store".withField("c2", lit(newVal)))
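
To try the withField call end to end, here is a minimal sketch, assuming Spark 3.1 or later, a local SparkSession, and made-up sample data (the values and the app name are illustrative only):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, struct}
import org.apache.spark.sql.types.StructType

// Local session for the sketch; in spark-shell the existing session is reused
val spark = SparkSession.builder().master("local[1]").appName("with-field").getOrCreate()
import spark.implicits._

// Hypothetical data: one row with a struct column "store" holding c1..c4
val df = Seq(("a", "b", "c", "d")).toDF("c1", "c2", "c3", "c4")
  .select(struct($"c1", $"c2", $"c3", $"c4").as("store"))

// Replace only c2; the other fields of the struct are carried over unchanged
val updated = df.withColumn("store", $"store".withField("c2", lit("new")))

updated.printSchema()
```

Unlike rebuilding the struct by hand, this does not require listing the fields that stay the same.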
