
Scala Spark streaming fileStream

Similar to this question, I'm trying to use fileStream but am receiving a compile-time error about the type arguments. I'm trying to ingest XML data using org.apache.mahout.text.wikipedia.XmlInputFormat, provided by mahout-examples, as my InputFormat type.

val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)

The compilation errors are:

Error:(39, 26) type arguments [org.apache.hadoop.io.LongWritable,scala.xml.Text,org.apache.mahout.text.wikipedia.XmlInputFormat] conform to the bounds of none of the overloaded alternatives of
 value fileStream: [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean, conf: org.apache.hadoop.conf.Configuration)(implicit evidence$12: scala.reflect.ClassTag[K], implicit evidence$13: scala.reflect.ClassTag[V], implicit evidence$14: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and> [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean)(implicit evidence$9: scala.reflect.ClassTag[K], implicit evidence$10: scala.reflect.ClassTag[V], implicit evidence$11: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and> [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String)(implicit evidence$6: scala.reflect.ClassTag[K], implicit evidence$7: scala.reflect.ClassTag[V], implicit evidence$8: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)]
    val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)
                         ^
Error:(39, 26) wrong number of type parameters for overloaded method value fileStream with alternatives:
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean, conf: org.apache.hadoop.conf.Configuration)(implicit evidence$12: scala.reflect.ClassTag[K], implicit evidence$13: scala.reflect.ClassTag[V], implicit evidence$14: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and>
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String, filter: org.apache.hadoop.fs.Path => Boolean, newFilesOnly: Boolean)(implicit evidence$9: scala.reflect.ClassTag[K], implicit evidence$10: scala.reflect.ClassTag[V], implicit evidence$11: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)] <and>
  [K, V, F <: org.apache.hadoop.mapreduce.InputFormat[K,V]](directory: String)(implicit evidence$6: scala.reflect.ClassTag[K], implicit evidence$7: scala.reflect.ClassTag[V], implicit evidence$8: scala.reflect.ClassTag[F])org.apache.spark.streaming.dstream.InputDStream[(K, V)]
    val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)
                     ^

I'm very new to Scala, so I'm not really familiar with type classes (I'm assuming that's what's happening here?). Any help would be appreciated.

The error shows that the compiler is resolving scala.xml.Text, whereas you need org.apache.hadoop.io.Text.
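A minimal sketch of the fix, assuming the same ssc StreamingContext and WATCHDIR path as in the question:

import org.apache.hadoop.io.{LongWritable, Text} // Hadoop's Text, not scala.xml.Text
import org.apache.mahout.text.wikipedia.XmlInputFormat

// With the Hadoop Text type in scope, the type arguments satisfy
// F <: org.apache.hadoop.mapreduce.InputFormat[K, V] and the call compiles.
val fileStream = ssc.fileStream[LongWritable, Text, XmlInputFormat](WATCHDIR)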

