Spark Scala | Create DataFrame Dynamically
I would like to create DataFrame names dynamically from a collection.
Please see below:
val set1 = Set("category1","category2","category3")
The following is a function (a plain Scala method rather than a Spark UDF) which takes a string x from the set as input and generates the DataFrame accordingly:
def catDfgen(x: String): DataFrame = {
  spark.sql(s"select * from table where col1 = '$x'")
}
Now I need help here: I want not only to create each DataFrame, but also to generate the DataFrame's name dynamically, in order to achieve:
val category1DF = catDfgen("category1")
val category2DF = catDfgen("category2")
...etc. Would it be possible to do it using the code below?
set1.map( x => val $x+"DF" = catDfgen($x))
If not, please suggest an effective method.
Suman, I believe the code below might help with your use case:
import org.apache.spark.sql.{DataFrame, SparkSession}

object Test extends App {
  val spark: SparkSession = SparkSession.builder().master("local").getOrCreate()

  val set1 = Set("category1", "category2", "category3")

  // Key each DataFrame by its generated name (e.g. "category1DF") instead of
  // trying to create a separate val per category.
  val dfs: Map[String, DataFrame] = set1.map(x =>
    (s"${x}DF", spark.sql(s"select * from table where col1 = '$x'").alias(s"${x}DF").toDF())
  ).toMap

  // Look up a DataFrame by its name.
  dfs("category1DF").show()

  spark.stop()
}
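To spell out why the Map is used above: Scala val names are fixed at compile time, so a name like category1DF cannot be built from a string at runtime, and a Map keyed by the generated name is the usual substitute. The following is only a rough sketch of the same idea using the DataFrame API instead of an interpolated SQL string. The object name CategoryFrames, the helper splitByCategory, and the in-memory sample rows are all made up for illustration and stand in for the real table; only the column name col1 comes from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical names throughout: CategoryFrames and splitByCategory are not
// part of Spark or of the original post.
object CategoryFrames {

  // Build one filtered DataFrame per category, keyed by a generated name
  // such as "category1DF".
  def splitByCategory(source: DataFrame, categories: Set[String]): Map[String, DataFrame] =
    categories.map { c =>
      s"${c}DF" -> source.filter(col("col1") === c)
    }.toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("category-frames")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample rows standing in for the real table.
    val source = Seq(
      ("category1", 10),
      ("category2", 20),
      ("category1", 30)
    ).toDF("col1", "value")

    val dfs = splitByCategory(source, Set("category1", "category2", "category3"))

    // Direct lookup; throws NoSuchElementException if the name is unknown.
    dfs("category1DF").show()

    // Option-based lookup; does nothing if the name is unknown.
    dfs.get("category4DF").foreach(_.show())

    spark.stop()
  }
}
```

Filtering with col("col1") === c also avoids splicing the category value into a SQL string, which matters if the values ever come from user input.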