Spark type mismatch error

I have the following function:

def doSomething(line: RDD[(String, String)]): String = {
    val c = line.toLocalIterator.mkString
    val file2 = KeepEverythingExtractor.INSTANCE.getText(c)
    file2
}

It takes an org.apache.spark.rdd.RDD[(String, String)] and returns a String.

I have some files stored in HDFS, which I access as follows:

val logData = sc.wholeTextFiles("hdfs://localhost:9000/home/akshat/recipes/recipes/simplyrecipes/*/*/*/*")

logData is of type org.apache.spark.rdd.RDD[(String, String)].

I have to map over these files with the doSomething function:

val mapper = logData.map(doSomething)

But this produces the following error:

<console>:32: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[(String, String)] => String
 required: ((String, String)) => ?
       val mapper = logData.map(doSomething)
                                ^

My function declares its input and output types, and I am passing it input of exactly that type. Why does this error occur, and what should I change to fix it?
Thanks in advance!

What map passes to your function is not the RDD[(String, String)] itself but each of its (String, String) pairs, one at a time, hence the error. In the same way, when you map over a list, the function receives the elements of the list one by one, not the list itself.
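
The same behaviour can be seen without Spark. Below is a minimal sketch using a plain Scala List in place of the RDD; the object name MapOverPairs, the helper pathOf, and the sample paths are made up for illustration.

```scala
object MapOverPairs {
  // Stand-in for the (path, content) pairs that wholeTextFiles yields;
  // the values here are invented sample data.
  val files: List[(String, String)] = List(
    ("hdfs://host/a.txt", "first file"),
    ("hdfs://host/b.txt", "second file")
  )

  // map calls the function once per element, so the parameter type must be
  // the element type (String, String), not List[(String, String)].
  def pathOf(file: (String, String)): String = file._1

  val paths: List[String] = files.map(pathOf)
}
```

Declaring pathOf with a List[(String, String)] parameter instead should give the same kind of "type mismatch" as in the question.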

Let's say you want to extract the file path; then you need something like this:

def doSomething(x: (String, String)): String = {
    x match {
        case (fname, content) => fname
    }
}

or simply:

logData.map(_._1)
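
The doSomething in the question was meant to run an extractor over the file contents rather than return the path. Here is a rough sketch of that per-file version. To keep it runnable without Spark or the extractor library, extractText is a hypothetical stand-in for KeepEverythingExtractor.INSTANCE.getText, and a plain Seq with invented sample data plays the role of the RDD.

```scala
object PerFileExtraction {
  // Hypothetical stand-in for KeepEverythingExtractor.INSTANCE.getText;
  // it only trims whitespace so the example has no dependencies.
  def extractText(content: String): String = content.trim

  // Accepts one (path, content) pair, which is what map hands to the function.
  def doSomething(file: (String, String)): String = file match {
    case (_, content) => extractText(content)
  }

  // Invented sample data standing in for the result of sc.wholeTextFiles(...).
  val logData: Seq[(String, String)] = Seq(
    ("recipes/one.html", "  soup  "),
    ("recipes/two.html", "bread\n")
  )

  val mapper: Seq[String] = logData.map(doSomething)
}
```

With the real RDD, the same logData.map(doSomething) call should type-check, since the function now takes a single pair; swapping extractText for the actual extractor call is left as an assumption to verify in your setup.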
