
Input path does not exist (Apache Spark)

I'm new to Spark. I have been trying to read a text file on my computer, but I keep getting the same error no matter how I tweak the path to the file.

lines = sc.textFile(r"Documents/python-spark-tutorial/in/word_count.txt").collect()

Traceback (most recent call last):
  File "", line 1, in <module>
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\pyspark\rdd.py", line 816, in collect
    sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1257, in __call__
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File "C:\spark\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/Users/Home/Documents/python-spark-tutorial/in/word_count.txt
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)

Try the following code snippet.

sc.textFile("file:///path")
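The error message in the question shows that the relative path was resolved to file:/C:/Users/Home/Documents/..., so one way to apply the file:/// suggestion is to build an absolute URI and confirm the file exists before handing it to Spark. The following is a minimal sketch, not the answerer's code: the helper name to_file_uri is hypothetical, and sc is assumed to be an already running SparkContext.

```python
from pathlib import Path


def to_file_uri(path_str):
    """Return an absolute file:// URI for a local path (hypothetical helper).

    Raises FileNotFoundError early, with the resolved path in the message,
    instead of letting Spark fail later with "Input path does not exist".
    """
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # as_uri() yields e.g. file:///C:/Users/... on Windows
    return path.as_uri()


# Usage, assuming an existing SparkContext named `sc`:
# uri = to_file_uri("Documents/python-spark-tutorial/in/word_count.txt")
# lines = sc.textFile(uri).collect()
```

Note that as_uri() percent-encodes spaces and non-ASCII characters, which may need attention if your path contains them.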

My problem is solved. I had gotten the file extension wrong: I used txt instead of text.
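An extension mismatch like this is easy to miss, especially when the file manager hides extensions. As a rough sketch, the function below lists the files in the target directory that share the requested base name regardless of extension. The name same_stem_files is hypothetical and the code uses only the Python standard library, not any Spark API.

```python
from pathlib import Path


def same_stem_files(path_str):
    """List files next to the requested path that differ only by extension."""
    wanted = Path(path_str)
    parent = wanted.parent
    if not parent.is_dir():
        return []
    return sorted(
        p.name
        for p in parent.iterdir()
        if p.is_file() and p.stem == wanted.stem
    )


# Usage: if the directory only contains word_count.text, asking about
# word_count.txt reveals the real file name.
# print(same_stem_files("Documents/python-spark-tutorial/in/word_count.txt"))
```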
