Spark Scala: retrieve the schema and store it

Is it possible to retrieve the schema of an RDD and store it in a variable? I want to create a new data frame from another RDD using the same schema. For example, below is what I am hoping to have:

val schema = oldDF.getSchema()
val newDF = sqlContext.createDataFrame(rowRDD, schema)

Assuming I already have rowRDD in the format of RDD[org.apache.spark.sql.Row], is this possible?

Just use the schema attribute of the DataFrame:

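import org.apache.spark.sql.Row

// Existing DataFrame whose schema will be reused
val oldDF = sqlContext.createDataFrame(sc.parallelize(Seq(("a", 1))))
val rowRDD = sc.parallelize(Seq(Row("b", 2)))

// oldDF.schema returns a StructType; pass it along with the RDD[Row]
sqlContext.createDataFrame(rowRDD, oldDF.schema)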
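
For readers on Spark 2.0 or later, the same idea can be sketched with SparkSession instead of sqlContext. This is only an illustration under assumptions: the object name ReuseSchema, the helper fromSameSchema, the column names "name" and "count", and the local master setting are all made up for the example and do not come from the question.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.StructType

object ReuseSchema {
  // Hypothetical helper: build a DataFrame from rows using another DataFrame's schema.
  def fromSameSchema(spark: SparkSession, template: DataFrame, rows: RDD[Row]): DataFrame = {
    // DataFrame.schema returns a StructType, an ordinary value that can be stored.
    val schema: StructType = template.schema
    spark.createDataFrame(rows, schema)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("reuse-schema")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Existing DataFrame whose schema we want to reuse.
    val oldDF = Seq(("a", 1)).toDF("name", "count")

    // Rows must match the schema's column order and types.
    val rowRDD = spark.sparkContext.parallelize(Seq(Row("b", 2)))
    val newDF = fromSameSchema(spark, oldDF, rowRDD)

    newDF.printSchema()
    newDF.show()
    spark.stop()
  }
}
```

Note that createDataFrame does not eagerly validate every row against the schema, so a mismatch in column order or types typically only surfaces once an action such as show() runs.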
