
How to get the index of each partition when using RDD.mapPartitionsWithIndex?

I am new to Spark and Scala. Is there a way in Spark to get the partition ID/number from RDD.mapPartitionsWithIndex, where it is used as follows:

// index receives the partition number from mapPartitionsWithIndex
def randomint(index: Int, iter: Iterator[T]): Iterator[(Int, T)] = {
...
}
self.mapPartitionsWithIndex(randomint).partitionBy(new randParti(nump)).values

Your naming might be confusing, but the index parameter of the randomint function does contain what you are looking for: the partition number.
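To make that concrete, here is a minimal sketch of how the partition number reaches the function passed to mapPartitionsWithIndex. The names PartitionIndexExample and tagWithPartition, the local[2] master, and the sample data are assumptions made for illustration. The body of randomint is elided in the question and randParti is not shown, so neither is reproduced here.

```scala
import org.apache.spark.sql.SparkSession

object PartitionIndexExample {
  // Hypothetical helper: pairs every element with the index of the
  // partition it came from. Spark passes that index as the first argument.
  def tagWithPartition[T](index: Int, iter: Iterator[T]): Iterator[(Int, T)] =
    iter.map(elem => (index, elem))

  def main(args: Array[String]): Unit = {
    // Local session for the sketch only; a real job would get its master
    // from spark-submit.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partition-index")
      .getOrCreate()

    // Six numbers spread over three partitions.
    val rdd = spark.sparkContext.parallelize(1 to 6, numSlices = 3)

    // The first lambda parameter is the partition index (0 until numPartitions).
    val tagged = rdd
      .mapPartitionsWithIndex((index, iter) => tagWithPartition(index, iter))
      .collect()

    // Each printed pair is (partitionIndex, element).
    tagged.foreach(println)

    spark.stop()
  }
}
```

The indexes run from 0 to the number of partitions minus one, so each pair can be read as (partition number, element).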
