
How to convert RDD[(String, String)] into RDD[Array[String]]?

I am trying to append the file name to each record in the file. I thought this would be easy to do if the RDD elements were arrays.

Some help with converting RDD type or solving this problem would be much appreciated!

With the (String, String) type:

scala> myRDD.first()(1)
<console>:24: error: (String, String) does not take parameters
       myRDD.first()(1)

With the Array[String] type:

scala> myRDD.first()(1)
res1: String = abcdefgh

My function:

import scala.util.matching.Regex

def appendKeyToValue(x: Array[Array[String]]): Unit = {
    val pattern = new Regex("\\.")
    for (i <- 0 to (x.length - 1)) {
        val key = x(i)(0)
        // Replace every "." in the file name with "|"
        val key2 = pattern.replaceAllIn(key, "|")
        val tempvalue = x(i)(1)
        val finalval = tempvalue.split("\n")
        for (ab <- 0 to (finalval.length - 1)) {
            // Prefix each record with the modified file name
            val result = key2 + "|" + finalval(ab)
        }
    }
}

If you have an RDD[(String, String)], you can access the first field of the first tuple by calling

val firstTupleField: String = myRDD.first()._1
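
The error in the question arises because, in Scala 2 (which the error message suggests), a tuple has no apply method, so myRDD.first()(1) only compiles when the element is an Array. Here is a minimal sketch in plain Scala, with no Spark required, that can be pasted into the Scala REPL or spark-shell. The values pair and arr are made up for illustration:

```scala
// Hypothetical sample values standing in for one RDD element.
val pair: (String, String) = ("file.txt", "abcdefgh")
val arr: Array[String] = Array("file.txt", "abcdefgh")

// Tuple fields are read with the 1-based accessors _1 and _2.
val fromTuple: String = pair._2

// Array elements are read with 0-based indexing, which calls apply.
val fromArray: String = arr(1)

// In Scala 2, pair(1) does not compile:
// "(String, String) does not take parameters".
```

Both fromTuple and fromArray hold the second element, so the choice between the two representations is mostly about the access syntax.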

If you want to convert an RDD[(String, String)] into an RDD[Array[String]], you can do the following:

val arrayRDD: RDD[Array[String]] = myRDD.map(x => Array(x._1, x._2))

You may also employ a partial function to destructure the tuples:

val arrayRDD: RDD[Array[String]] = myRDD.map { case (a,b) => Array(a, b) }
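
For the underlying goal of prefixing every record with its file name, the conversion to arrays may not be needed at all: a flatMap over the (file name, content) pairs can emit one output line per record. The sketch below uses a local Seq as a stand-in for the RDD so that it runs without Spark. RDD offers a flatMap of the same shape, so the same helper should carry over as myRDD.flatMap(prefixRecords). The name prefixRecords and the sample data are hypothetical, and the "." to "|" replacement mirrors the function in the question.

```scala
// Hypothetical helper: turn one (fileName, content) pair into prefixed records.
def prefixRecords(pair: (String, String)): Seq[String] = {
  val (fileName, content) = pair
  // String.replace is literal (not a regex), so "." needs no escaping here.
  val key = fileName.replace(".", "|")
  // One output line per record, each prefixed with the modified file name.
  content.split("\n").toSeq.map(record => key + "|" + record)
}

// Local stand-in for an RDD[(String, String)]; the sample data is made up.
val files: Seq[(String, String)] = Seq(
  ("a.txt", "r1\nr2"),
  ("b.csv", "r3")
)

val prefixed: Seq[String] = files.flatMap(prefixRecords)
// prefixed == Seq("a|txt|r1", "a|txt|r2", "b|csv|r3")
```

Unlike the loop in the question, which computes each result and then discards it, this version returns the prefixed records, so they can be collected or saved afterwards.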

