HDFS shows the list of local files

I installed Hadoop on OS X and it was working well. I am new to Hadoop and am putting in effort to learn more about application development with it.

Until yesterday, when I needed to see the list of directories and files in HDFS, I could just type

$ hadoop fs -ls 

and it would show me all the contents of the cluster.

Today, it shows all the contents of the local file system instead. I have to provide the exact HDFS address to get the list of contents:

$ hadoop fs -ls hdfs://localhost:8020/user/myName

My core-site.xml file is the same as before:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:8020</value>
    </property>
</configuration>

I stopped the cluster and re-formatted the distributed file system with the command below before starting the Hadoop daemons again, so that I could put my data sources into HDFS while performing the MapReduce job:

$ hdfs namenode -format
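
(For reference, a typical stop/format/start cycle looks like the sketch below; stop-dfs.sh and start-dfs.sh live in $HADOOP_HOME/sbin, and note that formatting the namenode destroys any existing HDFS data.)

$ stop-dfs.sh                          # stop the NameNode and DataNode daemons first
$ hdfs namenode -format                # wipe and re-initialize the namenode metadata
$ start-dfs.sh                         # start the daemons again
$ hadoop fs -mkdir -p /user/myName     # recreate the user home directory after a format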

I get an admin report informing me that the FileSystem file:/// is not an HDFS file system:

$ hadoop dfsadmin -report
WARNING: Use of this script to execute dfsadmin is deprecated.
WARNING: Attempting to execute replacement "hdfs dfsadmin" instead.

2018-10-18 18:01:27,316 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
report: FileSystem file:/// is not an HDFS file system
Usage: hdfs dfsadmin [-report] [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]

I also updated the configuration in the core-site.xml file to the following:

<property>
    <!-- <name>fs.default.name</name> -->
    <!-- <value>hdfs://localhost:8020</value> -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost.localdomain:8020/</value>
</property>

I reformatted again and this did not change a thing. As the other answer mentioned, the Hadoop home is already provided in the ~/.bashrc file:

export HADOOP_HOME=/Users/chaklader/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

How do I switch back to the HDFS file system? Any advice would be appreciated.

You'll want to ensure you've set an environment variable called HADOOP_CONF_DIR to the directory containing Hadoop's XML configuration files.

You can do that in the .bashrc file in your home folder.
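
For example (the path below assumes the Homebrew 3.1.1 layout shown further down; adjust it to your own install):

export HADOOP_CONF_DIR=/usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop   # directory holding core-site.xml, hdfs-site.xml, etc.

Then run source ~/.bashrc (or open a new shell) so the variable takes effect.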


Otherwise, you would get the default filesystem, file://, which is still valid and still works fine for running MapReduce jobs.
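
A quick way to confirm which filesystem the client actually resolves is hdfs getconf; once HADOOP_CONF_DIR points at the right directory, this should print the hdfs:// URI rather than file:///

$ hdfs getconf -confKey fs.defaultFS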


FWIW, here is my core-site.xml:

$ cat /usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:///tmp/hadoop/hdfs/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

And my hdfs-site.xml:

$ cat /usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///tmp/hadoop/hdfs/names</value>
    </property>
    <property>
        <name>fs.checkpoint.dir</name>
        <value>file:///tmp/hadoop/hdfs/checkpoint</value>
    </property>
    <property>
        <name>fs.checkpoint.edits.dir</name>
        <value>file:///tmp/hadoop/hdfs/checkpoint-edits</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///tmp/hadoop/hdfs/data</value>
    </property>
</configuration>

Edit your core-site.xml file as below.

<value>hdfs://localhost.localdomain:8020/</value>

I believe the missing trailing slash (8020/) may have caused the issue. Try it.
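
After editing, restart the HDFS daemons and confirm that a bare listing hits the cluster again:

$ hadoop fs -ls    # should list your HDFS home directory, not the local one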

We had the same problem on a Cloudera Data Platform 7.1.5 edge node (a node with no masters or slaves: only clients and the Cloudera Manager). HDFS files were shown normally on every cluster node except the edge node, where the local filesystem was shown instead. The solution was to install the Gateway role on our edge node, as described at https://community.cloudera.com/t5/Cloudera-Manager-Installation/quot-Hadoop-fs-ls-quot-Produces-the-Local-Filesystem-s-quot/td-p/4743 (thanks for the link, Jim Todd).
