
Accessing HDFS from PySpark fails

I have installed Hadoop 2.7.3 and PySpark 2.2.0 on Ubuntu 17.04.

Both Hadoop and PySpark seem to work properly on their own. However, I have not managed to read files from HDFS in PySpark. When I try to read a file from HDFS, I get the following error:

https://imgur.com/j6Dy2u7

I read in another post that the environment variable HADOOP_CONF_DIR needs to be set in order to access HDFS. I did that as well (see the next screenshot), but then I get a different error and PySpark stops working entirely.

https://imgur.com/AMpJ6TB
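For reference, the variable can be set in the shell or from Python before the SparkSession is created. A minimal sketch, assuming the Hadoop configuration lives under /usr/local/hadoop/etc/hadoop (a hypothetical path; adjust it to your install):

    import os

    # Must be set before the first SparkSession is created, because the
    # backing JVM reads HADOOP_CONF_DIR at startup. The path is an assumed
    # location for a manual Hadoop install; adjust to your installation.
    os.environ["HADOOP_CONF_DIR"] = "/usr/local/hadoop/etc/hadoop"

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hdfs-check").getOrCreate()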

If I delete the environment variable, everything works as before.

How can I fix this so that I can open files from HDFS in PySpark? I have spent a long time on this and would greatly appreciate any help!

Although this answer is a bit late: you should use hdfs:///test/PySpark.txt (note the three slashes).
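A minimal sketch of what that looks like in PySpark (the path /test/PySpark.txt is taken from the answer; the app name is arbitrary):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-hdfs").getOrCreate()

    # With three slashes the authority part of the URI is empty, so the
    # NameNode host and port are resolved from fs.defaultFS in the Hadoop
    # configuration (found via HADOOP_CONF_DIR) rather than spelled out
    # explicitly, as they would be in hdfs://host:port/test/PySpark.txt.
    rdd = spark.sparkContext.textFile("hdfs:///test/PySpark.txt")
    print(rdd.take(5))

Using the three-slash form keeps the code portable: the same path works against any cluster whose configuration is on the classpath, without hard-coding a NameNode address.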
