I need to add an empty column of StructType to an existing DataFrame.
I tried the following:
df = df.withColumn("features", typedLit(StructType(Nil)))
And:
df = df.withColumn("features", lit(new GenericRowWithSchema(Array(), StructType(Nil))))
However, both attempts fail with an "unsupported literal type" error.
A crude workaround is to use a user-defined function that returns an empty row for every record:
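import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types.StructType
def addEmptyRowColumn(df: DataFrame, newColumnName: String): DataFrame = {
  // Zero-argument UDF that returns an empty Row; its return type is declared as an empty StructType
  val addEmptyRowUdf = udf(() =>
    new GenericRowWithSchema(Array(), StructType(Nil)), StructType(Nil))
  df.withColumn(newColumnName, addEmptyRowUdf())
}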
df = addEmptyRowColumn(df, "features")
As a one-liner, without a UDF (PySpark):
from pyspark.sql import types as T, functions as F
df.withColumn(newColumnName, F.lit(None).cast(T.StructType()))
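Since the question is in Scala, the same cast-based idea can probably be written there as well. The following is only a sketch under some assumptions: the object name, the local SparkSession, and the sample "id" column are made up for illustration and are not from the original post. Also note that this approach gives a column whose type is an empty struct but whose values are null, whereas the UDF version fills the column with empty rows.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.StructType

// Hypothetical wrapper object, for illustration only.
object EmptyStructColumnSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; in a real job, reuse the existing one.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-struct-column")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the existing DataFrame.
    val df: DataFrame = Seq(1, 2).toDF("id")

    // lit(null) is a NullType literal; casting it to an empty StructType
    // produces a struct-typed column in which every value is null.
    val withFeatures: DataFrame =
      df.withColumn("features", lit(null).cast(StructType(Nil)))

    withFeatures.printSchema()
    spark.stop()
  }
}
```

If downstream code needs non-null empty rows rather than nulls, the UDF approach above is the one that matches that requirement.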