
Hadoop on Local FileSystem

I'm running Hadoop in pseudo-distributed mode. I want to read from and write to the local filesystem, abstracting away HDFS for my job, using the file:/// parameter. I followed this link.

These are the contents of core-site.xml:

<configuration>
 <property>
  <name>hadoop.tmp.dir</name>
  <value> /home/abimanyu/temp</value>   
 </property>

 <property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
 </property>

</configuration>

These are the contents of mapred-site.xml:

<configuration>

 <property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
 </property>

 <property>
    <name>fs.default.name</name>
    <value>file:///</value>
 </property>

 <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
 </property>

 <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
 </property>

</configuration>

These are the contents of hdfs-site.xml:

<configuration>

 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>
</configuration>

This is the error I get when I try to start the daemons (using start-dfs or start-all):

localhost: Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
localhost:      at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:236)
localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:194)
localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:150)
localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:676)

What is strange to me is that reading from the local file system works completely fine in hadoop-0.20.2 but not in hadoop-1.2.1. Has anything changed from the initial release to the later version? Let me know how to read from the local file system for a Hadoop JAR.

You can remove the fs.default.name value from your mapred-site.xml file - this should only be in the core-site.xml file.
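With that property removed, the mapred-site.xml would reduce to something like the following sketch (keeping the values already shown in the question):

<configuration>

 <!-- JobTracker address, unchanged from the question -->
 <property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
 </property>

 <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
 </property>

 <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
 </property>

</configuration>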

If you want to run on your local file system, in a pseudo mode, this is typically achieved by running in what's called local mode - by setting the fs.default.name value in core-site.xml to file:/// (you currently have it configured for hdfs://localhost:54310).
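For example, a core-site.xml configured for local mode might look like this sketch (the hadoop.tmp.dir value is simply carried over from the question):

<configuration>
 <!-- Scratch directory, as in the original question -->
 <property>
  <name>hadoop.tmp.dir</name>
  <value>/home/abimanyu/temp</value>
 </property>

 <!-- Local mode: use the local filesystem instead of HDFS -->
 <property>
  <name>fs.default.name</name>
  <value>file:///</value>
 </property>
</configuration>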

The stack trace you are seeing is from the secondary name node starting up - this isn't needed when running in 'local mode' as there is no fsimage or edits file for the 2NN to work against.

Fix up your core-site.xml and mapred-site.xml. Stop all hadoop daemons and just start the map-reduce daemons (Job Tracker and Task Tracker).
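Roughly, with the standard Hadoop 1.x control scripts (assuming $HADOOP_HOME/bin is on your PATH):

# stop everything that is currently running
stop-all.sh

# start only the MapReduce daemons (JobTracker and TaskTracker)
start-mapred.sh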
