How to convert RDD to DF in Spark Scala?
I am new to Spark, and I am trying to convert the RDD below to a DataFrame, but without success.
val customerRDD = sc.textFile("file:///home/hduser/data//customer.txt")
//custId,CustName,CustEmail,CustPhone
//1,ABC,abc@gmail.com,+199240242234
Here I tried the customerRDD.toDF() method, but it is not working.
I have also tried the createDataFrame() method, but I could not figure it out.
Can anyone help me convert the RDD to a DF here?
Thanks
An odd way of doing things these days, but if you must use an RDD to read a file with a header, then consult https://sparkbyexamples.com/apache-spark-rdd/spark-load-csv-file-into-rdd/ and note specifically how the header line is skipped.
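A minimal sketch of that header-skipping step, using an in-memory RDD in place of the customer.txt file (the sample data and app name are assumptions for illustration):

```scala
import org.apache.spark.sql.SparkSession

object RddHeaderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-header-example") // illustrative app name
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Stand-in for sc.textFile("file:///home/hduser/data//customer.txt")
    val lines = sc.parallelize(Seq(
      "custId,CustName,CustEmail,CustPhone",
      "1,ABC,abc@gmail.com,+199240242234"))

    // Take the header (first line) and filter it out of the data
    val header = lines.first()
    val data = lines.filter(_ != header)

    data.collect().foreach(println)
    spark.stop()
  }
}
```

The `first()`/`filter` idiom is the common way to drop a CSV header from an RDD; with `spark.read.csv` you would simply pass `option("header", "true")` instead.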
For creating a DF from an RDD with a schema using Structs, see https://sparkbyexamples.com/apache-spark-rdd/convert-spark-rdd-to-dataframe-dataset. You can use createDataFrame() to create the schema for the DF programmatically from the RDD, or rely on implicits for the default schema.
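Both routes can be sketched as follows; the case-class and column names are assumptions based on the header in the question, and an in-memory RDD stands in for the file:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object RddToDfExample {
  // A case class gives toDF() its schema via implicits (field names assumed from the header)
  case class Customer(custId: String, custName: String, custEmail: String, custPhone: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-df-example") // illustrative app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // required for toDF() on an RDD
    val sc = spark.sparkContext

    // Stand-in for the customer.txt RDD, header already removed
    val customerRDD = sc.parallelize(Seq("1,ABC,abc@gmail.com,+199240242234"))

    // Option 1: map each line to a case class and call toDF() (implicit schema)
    val df1 = customerRDD
      .map(_.split(","))
      .map(a => Customer(a(0), a(1), a(2), a(3)))
      .toDF()

    // Option 2: build the schema programmatically and call createDataFrame()
    val schema = StructType(Seq(
      StructField("custId", StringType, nullable = true),
      StructField("CustName", StringType, nullable = true),
      StructField("CustEmail", StringType, nullable = true),
      StructField("CustPhone", StringType, nullable = true)))
    val rowRDD = customerRDD.map(_.split(",")).map(a => Row(a(0), a(1), a(2), a(3)))
    val df2 = spark.createDataFrame(rowRDD, schema)

    df1.show()
    df2.printSchema()
    spark.stop()
  }
}
```

The plain toDF() call in the question fails on an RDD of raw strings because Spark has no column structure to infer; mapping to a case class (or supplying a StructType) provides that structure.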