Iterative lookup from within rdd.map in Scala
def retrieveindex (stringlist: List[String], lookuplist: List[String]) =
stringlist.foreach(y => lookuplist.indexOf(y))
is my function.
I am trying to use it within an RDD like this:
val libsvm = libsvmlabel.map(x =>
Array(x._2._2,retrieveindex(x._2._1.toList,featureSet.toList)))
However, the output I get is empty. There is no error, but the result of retrieveindex is empty. When I use println to check whether I am retrieving correctly, I do see the indices printed.
Is there any way to do this? Should I first 'distribute' the function to all the workers? I am a newbie.
retrieveindex has a return type of Unit, because foreach just applies a function (String) ⇒ Unit to each element and discards the results. Therefore it does not map to anything.
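To make the difference concrete, here is a minimal sketch in plain Scala (no Spark needed). The object name, the helper names viaForeach and viaMap, and the sample lists are made up for illustration and are not part of the original question.

```scala
object ForeachVsMap {
  // foreach evaluates indexOf for every element but throws each result away,
  // so the method as a whole returns Unit.
  def viaForeach(words: List[String], lookup: List[String]): Unit =
    words.foreach(w => lookup.indexOf(w))

  // map keeps every result and collects them into a new list.
  def viaMap(words: List[String], lookup: List[String]): List[Int] =
    words.map(w => lookup.indexOf(w))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data.
    val lookup = List("a", "b", "c")
    val words  = List("c", "a")

    println(viaForeach(words, lookup)) // prints ()
    println(viaMap(words, lookup))     // prints List(2, 0)
  }
}
```

This also explains why a println placed inside the foreach shows the indices: they are computed, just never returned.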
You probably want it to return the list of indices, like:
def retrieveindex(stringlist: List[String], lookuplist: List[String]): List[Int] =
stringlist.map(y => lookuplist.indexOf(y))
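If the lookup list is large, calling indexOf for every element scans the list each time. One possible variation, sketched below under the assumption that the lookup list can be indexed once up front, is to build a Map from each string to its position. The names IndexLookup and buildIndex are hypothetical, and this version of retrieveindex takes the Map instead of the list.

```scala
object IndexLookup {
  // Build the index once: string -> position of its first occurrence.
  // Reversing before toMap makes the earliest position win for duplicates,
  // which matches what indexOf would return.
  def buildIndex(lookuplist: List[String]): Map[String, Int] =
    lookuplist.zipWithIndex.reverse.toMap

  // Same result as the indexOf version, including -1 for missing strings.
  def retrieveindex(stringlist: List[String], index: Map[String, Int]): List[Int] =
    stringlist.map(y => index.getOrElse(y, -1))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data.
    val index = buildIndex(List("a", "b", "c"))
    println(retrieveindex(List("c", "a", "z"), index)) // prints List(2, 0, -1)
  }
}
```

In the RDD case, the index would be built once before the call to map rather than inside it. The function does not need to be distributed by hand, since Spark serializes the closure passed to map and ships it to the workers.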