hadoop - HDFS file distribution

I just started to play with Hadoop and I have the following question. We know that the NameNode holds metadata about the input blocks. My questions are:

  1. How can I view or query the metadata?
  2. How can I see how my input file is split into blocks and distributed?
  3. How can I make sure that my input file is split into blocks and distributed in HDFS?

PS: I have already referred to the following site:

http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

Thanks!

  1. How can I view or query the metadata?

    You can do that with the Offline Image Viewer. It is a tool that dumps the contents of fsimage files to human-readable formats, allowing offline analysis and examination of a Hadoop cluster's namespace.

    Usage:

    bin/hdfs oiv -i fsimage -o fsimage.txt

    You can find more on this here.

  2. How can I see how my input file is split into blocks and distributed?

    The easiest way is to point your web browser at the HDFS web UI, i.e. namenode_machine:50070 (the default NameNode HTTP port in Hadoop 1.x and 2.x; Hadoop 3.x defaults to 9870). Then browse to the file in question and click to open it. Scroll down to see the location of each block of the file.

    Alternatively, you can use getFileBlockLocations(FileStatus file, long start, long len) from the FileSystem API. It returns an array of BlockLocation objects containing the hostnames, offset, and size of each portion of the given file.

  3. How can I make sure that my input file is split into blocks and distributed in HDFS?

    You can use fsck for that. It reports the relevant details for a particular file, such as total blocks, minimally replicated blocks, and under-replicated blocks.
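As a rough sketch of the fsck check described above: the -files, -blocks and -locations options ask fsck to list each block of the file and the DataNodes holding its replicas. The function name block_report and the HDFS_BIN variable are illustrative assumptions, not part of Hadoop; HDFS_BIN only lets you point at a launcher that is not on your PATH.

```shell
# Report how a file is split into blocks and where the replicas live.
# HDFS_BIN is a hypothetical override for the hdfs launcher; defaults to "hdfs".
block_report() {
    # $1: HDFS path of the file to inspect
    "${HDFS_BIN:-hdfs}" fsck "$1" -files -blocks -locations
}
```

You would call it as block_report /user/hadoop/input.txt, where the path is only a placeholder for your own file.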

The NameNode's metadata is stored in a file called "fsimage". See the link below for reference:

Content of the fsimage hdfs
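To look inside an fsimage file yourself, something like the following may help. It is a minimal sketch that assumes Hadoop 2.x or later, where the Offline Image Viewer needs an explicit processor such as XML to write a dump to a file. The function name dump_fsimage and the HDFS_BIN variable are hypothetical names introduced here for illustration.

```shell
# Dump an fsimage file to XML for offline inspection.
# Assumes Hadoop 2.x or later; HDFS_BIN is a hypothetical override for the
# hdfs launcher and defaults to "hdfs".
dump_fsimage() {
    # $1: local path to an fsimage file, $2: output file to write
    "${HDFS_BIN:-hdfs}" oiv -p XML -i "$1" -o "$2"
}
```

If you do not have a local copy of the image, hdfs dfsadmin -fetchImage with a local directory argument should download the latest one from the NameNode, after which dump_fsimage can be run on the downloaded file.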
