Apache Spark error: "Input path does not exist"

Question

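Is there any algorithm in Apache Spark for finding frequent patterns in a text file? I tried the following example, but I always get this error:

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/D:/spark-1.3.1-bin-hadoop2.6/bin/data/mllib/sample_fpgrowth.txt

Can anyone help me solve this problem?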

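import org.apache.spark.mllib.fpm.FPGrowth

// Each line of the input file is one transaction; items are separated by spaces.
val transactions = sc.textFile("...").map(_.split(" ")).cache()

// run() returns the fitted FPGrowthModel, so keep its result.
val model = new FPGrowth()
  .setMinSupport(0.5)
  .setNumPartitions(10)
  .run(transactions)

// Print each frequent itemset together with its frequency.
model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
}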

Answer 1

Try this:

file://D:/spark-1.3.1-bin-hadoop2.6/bin/data/mllib/sample_fpgrowth.txt

or

D:/spark-1.3.1-bin-hadoop2.6/bin/data/mllib/sample_fpgrowth.txt

If that does not work, replace / with //.
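Before trying different spellings of the path, it can help to confirm that a file really exists at that location. The path in the error message sits under bin/, which possibly means a relative path was resolved against the directory the shell was started from. Below is a minimal sketch using only the JDK; `PathCheck` and `localInputUri` are hypothetical names chosen for illustration and are not part of Spark.

```scala
import java.io.File

object PathCheck {
  // Hypothetical helper (not a Spark API): returns a file: URI string
  // for a local path, or None when nothing exists at that path.
  def localInputUri(path: String): Option[String] = {
    val f = new File(path).getAbsoluteFile
    if (f.exists()) Some(f.toURI.toString) else None
  }
}

// Example check for the path from the error message.
PathCheck.localInputUri("D:/spark-1.3.1-bin-hadoop2.6/bin/data/mllib/sample_fpgrowth.txt") match {
  case Some(uri) => println("Found, pass this to sc.textFile: " + uri)
  case None      => println("No file at that path; check where the sample data is located.")
}
```

If the check reports that the file is missing, changing the slashes will not help; the path itself has to point at the real location of the file.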

Answer 2

I assume you are running Spark on Windows.

Use a file path like this:

D:\spark-1.3.1-bin-hadoop2.6\bin\data\mllib\sample_fpgrowth.txt

Note: escape the backslashes as "\\" if necessary.
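The escaping matters because Scala treats a backslash in an ordinary string literal as the start of an escape sequence (for example, \b would be read as a backspace). A short sketch of the usual ways to write the same Windows path follows; `WindowsPaths` and `toForwardSlashes` are hypothetical names used only for illustration.

```scala
object WindowsPaths {
  // Ordinary string literal: every backslash is written as \\.
  val escaped: String =
    "D:\\spark-1.3.1-bin-hadoop2.6\\bin\\data\\mllib\\sample_fpgrowth.txt"

  // Triple-quoted literal: these backslashes are taken literally.
  val raw: String =
    """D:\spark-1.3.1-bin-hadoop2.6\bin\data\mllib\sample_fpgrowth.txt"""

  // Hypothetical helper: rewrite a Windows path with forward slashes.
  def toForwardSlashes(path: String): String = path.replace('\\', '/')
}

// Both literals denote the same string.
println(WindowsPaths.escaped == WindowsPaths.raw)
println(WindowsPaths.toForwardSlashes(WindowsPaths.raw))
```

Either literal can be passed to sc.textFile; the forward-slash form avoids escaping altogether.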
