
How to convert a dataframe to RDD[String, String]?

I have a data frame:

df : [id: String, country: String, title: String]

How do I convert it to RDD[String, String], where the first column is the key and a JSON string built from the remaining columns is the value?

key : id
value : {"country": "US", "title": "MK"}

You cannot have an RDD[String, String]. RDD takes only one type parameter, so what you want is RDD[(String, String)], an RDD of key/value tuples.

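df.rdd
  .map { row =>
    // Columns are read by position: 0 = id, 1 = country, 2 = title.
    val id = row.getString(0)
    val country = row.getString(1)
    val title = row.getString(2)

    // Quote keys and values so the result is valid JSON.
    // Note: the values are not escaped here.
    val jsonString = s"""{"country": "$country", "title": "$title"}"""

    (id, jsonString)
  }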
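
The snippet above interpolates column values directly into the string, so a value containing a double quote or a backslash would produce invalid JSON. A minimal sketch of how the values could be escaped first is shown below; `JsonSketch`, `escapeJson` and `toJsonObject` are hypothetical names introduced for illustration, not part of Spark, and a real JSON library would be the more robust choice.

```scala
// Hypothetical helpers (not part of Spark): escape strings and build a flat JSON object.
object JsonSketch {
  // Escape the characters that must not appear raw inside a JSON string.
  def escapeJson(s: String): String =
    s.flatMap {
      case '"'          => "\\\""
      case '\\'         => "\\\\"
      case '\n'         => "\\n"
      case '\r'         => "\\r"
      case '\t'         => "\\t"
      case c if c < ' ' => "\\u%04x".format(c.toInt) // other control characters
      case c            => c.toString
    }

  // Build {"k1":"v1","k2":"v2"} from key/value pairs, keeping the given order.
  def toJsonObject(fields: (String, String)*): String =
    fields
      .map { case (k, v) => "\"" + escapeJson(k) + "\":\"" + escapeJson(v) + "\"" }
      .mkString("{", ",", "}")
}
```

Inside the `map`, the value could then be built as `JsonSketch.toJsonObject("country" -> country, "title" -> title)`.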

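There is also DataFrame.toJSON, which returns an RDD[String] (a Dataset[String] from Spark 2.0 onward); you can build the conversion yourself on top of that method.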
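
A related way to let Spark do the JSON serialization is sketched below. It assumes Spark 2.1 or later, where `to_json` and `struct` are available in `org.apache.spark.sql.functions`, and it assumes the columns are named `id`, `country` and `title`. The object name `ToPairRdd` and the function `toPairRdd` are hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, struct, to_json}

object ToPairRdd {
  // Assumes df has string columns "id", "country" and "title".
  def toPairRdd(df: DataFrame): RDD[(String, String)] =
    df
      // Keep the key column and serialize the remaining columns as one JSON string.
      .select(col("id"), to_json(struct(col("country"), col("title"))).as("value"))
      .rdd
      // Each Row now holds two string fields: the key and the JSON value.
      .map(row => (row.getString(0), row.getString(1)))
}
```

Because Spark generates the JSON here, quoting and escaping of the values are handled by the library rather than by string interpolation.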
