Is it possible to cast a StringType column to an ArrayType column in a Spark DataFrame?
df.printSchema()
gives this
Schema ->
a: string(nullable= true)
Now I want to convert this to
a: array(nullable= true)
As elisiah commented, you have to split your string. You can use a UDF:
df.printSchema
import org.apache.spark.sql.functions._
val toArray = udf[Array[String], String]( _.split(" "))
val featureDf = df
  .withColumn("a", toArray(df("a")))
featureDf.printSchema
Gives output:
root
|-- a: string (nullable = true)
root
|-- a: array (nullable = true)
| |-- element: string (containsNull = true)
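If you are on Spark 1.5.0 or later, you can get the same result without a UDF by using the built-in `split` function from `org.apache.spark.sql.functions` (a sketch, assuming whitespace-separated values as above):

```scala
import org.apache.spark.sql.functions.split

// split takes a Column and a regex pattern and returns an array column
val featureDf = df.withColumn("a", split(df("a"), " "))
featureDf.printSchema
```

Built-in functions are generally preferable to UDFs because Catalyst can optimize them.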
Another option is to simply wrap any column in `functions.array`:
df.withColumn("a", functions.array(col("a")))
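Note that this approach does not split the string: it wraps the entire value in a single-element array, which differs from the UDF answer above. A quick sketch of the difference:

```scala
import org.apache.spark.sql.functions.{array, col, split}

// "foo bar" becomes a one-element array containing the whole string
df.withColumn("a", array(col("a")))

// "foo bar" becomes a two-element array, split on whitespace
df.withColumn("a", split(col("a"), " "))
```

Use `array` when each row's string should stay intact as a single element, and `split` when the string encodes multiple values.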