
Spark RDD: sort by two values

I have an RDD of (name: String, popularity: Int, rank: Int). I want to sort it by rank and, when ranks are equal, by popularity. I currently do this with two transformations:

var result = myRDD
        .sortBy(_._2, ascending = false)
        .sortBy(_._3, ascending = false)
        .take(10)

Can I do it in one transformation?

You can map the RDD to key-value pairs where the key is a tuple of (rank, popularity) and the value is the name, then sort by the key. Scala compares tuples element by element, so rank is compared first and popularity only breaks ties.

For example:

// _._1 - name
// _._2 - popularity
// _._3 - rank
var tupledRDD = myRDD.map(line => ((line._3, line._2), line._1))
        .sortBy(_._1, ascending = false)
        .take(10) // take returns a local Array of ((rank, popularity), name), not an RDD
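If you would rather keep the original (name, popularity, rank) rows instead of reshaping them, the tuple key can be passed straight to sortBy. Since only the first 10 rows are needed, top with a custom ordering is another option that avoids sorting the whole RDD. The sketch below is only an illustration: the object name SortByTwoKeys, the helpers topBySort and topByTop, the local-mode SparkContext, and the sample rows are all made up for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object for illustration; not part of Spark.
object SortByTwoKeys {
  // Row layout from the question: (name, popularity, rank).
  type Row = (String, Int, Int)

  // Composite key: rank first, popularity as the tie-breaker.
  val rankThenPopularity: Row => (Int, Int) = r => (r._3, r._2)

  // A single sortBy on the tuple key, highest first, then take the first n rows.
  def topBySort(rdd: RDD[Row], n: Int): Array[Row] =
    rdd.sortBy(rankThenPopularity, ascending = false).take(n)

  // top(n) returns the n largest rows under the given ordering, in descending order.
  def topByTop(rdd: RDD[Row], n: Int): Array[Row] =
    rdd.top(n)(Ordering.by(rankThenPopularity))

  def main(args: Array[String]): Unit = {
    // Local-mode context for illustration only; a real job would reuse its own context.
    val conf = new SparkConf().setAppName("sort-by-two-keys").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data.
      val myRDD = sc.parallelize(Seq(("a", 5, 1), ("b", 9, 2), ("c", 7, 2), ("d", 3, 3)))
      topBySort(myRDD, 10).foreach(println)
      topByTop(myRDD, 10).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Both helpers should return the same rows in the same order. Which one is faster will depend on the data size and on n, so it is worth measuring on your own data.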
