Create empty column of StructType in spark dataframe

I need to add an empty column of StructType to an existing DataFrame.

I tried the following:

df = df.withColumn("features", typedLit(StructType(Nil)))

And:

df = df.withColumn("features", lit(new GenericRowWithSchema(Array(), StructType(Nil))))

However, both attempts fail with an "unsupported literal type" error.

In a crude way, one can use a user-defined function to add a column with empty rows:

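import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types.StructType

def addEmptyRowColumn(df: DataFrame, newColumnName: String): DataFrame = {
  // Untyped udf(f, dataType) overload: deprecated since Spark 3.0, where it
  // requires spark.sql.legacy.allowUntypedScalaUDF=true.
  val addEmptyRowUdf = udf( () =>
    new GenericRowWithSchema(Array(), StructType(Nil)), StructType(Nil))

  df.withColumn(newColumnName, addEmptyRowUdf())
}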

df = addEmptyRowColumn(df, "features")

As a one-liner, without a UDF (PySpark):

from pyspark.sql import functions as F, types as T

df.withColumn(newColumnName, F.lit(None).cast(T.StructType()))
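
Since the question is written in Scala, here is a rough sketch of the same null-cast idea in the Scala API. The helper name addNullStructColumn and the small local DataFrame are made up for illustration and are not from the original posts. Note that this produces a column whose type is an empty struct but whose value is null in every row, which differs from the UDF approach above, where each row holds an empty Row.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.types.StructType

// Hypothetical helper (name is illustrative): adds a struct-typed column
// that is null in every row, using a cast on a null literal instead of a UDF.
def addNullStructColumn(df: DataFrame, newColumnName: String): DataFrame =
  df.withColumn(newColumnName, lit(null).cast(StructType(Nil)))

// Assumed setup for illustration: a tiny local DataFrame with three rows.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("empty-struct-column")
  .getOrCreate()

val df = spark.range(3).toDF("id")
val result = addNullStructColumn(df, "features")

// Inspect the resulting schema; "features" should show up as a struct column.
result.printSchema()
```

This is written to be pasted into spark-shell or run as a script. Whether a null struct is acceptable depends on what later consumes the column; if downstream code needs a non-null empty Row, the UDF version may still be required.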

