
Iterative lookup from within rdd.map in Scala

def retrieveindex (stringlist: List[String], lookuplist: List[String]) = 
  stringlist.foreach(y => lookuplist.indexOf(y))

This is my function.

I am trying to use it inside an RDD transformation like this:

val libsvm = libsvmlabel.map(x => 
  Array(x._2._2,retrieveindex(x._2._1.toList,featureSet.toList)))

However, the output I get is empty. There is no error, but the result of retrieveindex is empty. When I add a println to check whether the lookup works, I do see the indices printed. Is there any way to do this? Should I first 'distribute' the function to all the workers? I am a newbie.

retrieveindex has a return type of Unit, because foreach only applies a function to each element for its side effects and discards whatever that function returns. The indices computed by indexOf are therefore thrown away, and the function does not produce a value you can put into the array.
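To make the difference concrete, here is a minimal sketch on plain Scala collections (no Spark needed). The object name and the sample lists are made up for illustration. It also suggests why a println inside the foreach shows the indices: printing is a side effect, so it happens even though the results themselves are discarded.

```scala
object ForeachVsMap {
  // Hypothetical sample data, only for illustration.
  val lookup = List("a", "b", "c")
  val words  = List("c", "a", "z")

  // foreach evaluates indexOf for every element but discards each result,
  // so the whole expression has type Unit.
  val viaForeach: Unit = words.foreach(y => lookup.indexOf(y))

  // map collects the result of indexOf for every element into a new list.
  // indexOf returns -1 for elements that are not found.
  val viaMap: List[Int] = words.map(y => lookup.indexOf(y))
}
```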

You probably want it to return the list of indices, like:

def retrieveindex(stringlist: List[String], lookuplist: List[String]): List[Int] = 
  stringlist.map(y => lookuplist.indexOf(y))
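As an optional variation, not part of the original answer: indexOf scans the lookup list for every element, so with a large feature set it may be worth building a map from feature to position once and reusing it. The sketch below assumes the lookup list contains no duplicates (with duplicates, it would return the last position rather than the first, unlike indexOf). The names buildIndex and retrieveIndexFast are hypothetical.

```scala
object IndexLookupSketch {
  // Build the feature -> position table once.
  // Assumption: lookuplist has no duplicate entries.
  def buildIndex(lookuplist: Seq[String]): Map[String, Int] =
    lookuplist.zipWithIndex.toMap

  // Look up every string in the prebuilt table; -1 mirrors what
  // indexOf returns for a missing element.
  def retrieveIndexFast(strings: Seq[String], index: Map[String, Int]): List[Int] =
    strings.map(s => index.getOrElse(s, -1)).toList
}
```

On the question of distributing the function: Spark serializes the closure passed to rdd.map and ships it to the workers, so no manual step is needed for the function itself. If the lookup table is large, SparkContext.broadcast is the usual way to send it to each executor once.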
