
hadoop - HDFS file distribution

I just started to play with Hadoop and I have the following question. We know that the NameNode holds metadata about the input blocks. My questions are:

  1. How can I view or query the metadata?
  2. How can I see how my input file is split into blocks and distributed?
  3. How can I make sure that my input file is split into blocks and distributed in HDFS?

PS: I have already consulted the following site:

http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

Thanks!

  1. How can I view or query the metadata?

    You can do that with the help of the Offline Image Viewer. It is a tool that dumps the contents of fsimage files to human-readable formats, allowing offline analysis and examination of a Hadoop cluster's namespace.

    Usage:

    bin/hdfs oiv -i fsimage -o fsimage.txt

    You can find more on this in the Offline Image Viewer guide in the Hadoop documentation.
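To get an fsimage to feed into the viewer, one option is to download the latest one from the NameNode first. The following is a minimal sketch, assuming Hadoop 2.x or later with `hdfs` on the PATH. The function name `dump_fsimage` and the paths in the usage line are hypothetical, and `-p XML` explicitly selects the XML output processor.

```shell
#!/usr/bin/env bash
# Sketch: fetch the latest fsimage from the NameNode and dump it as XML.
# Assumes Hadoop 2.x+ and `hdfs` on PATH; dump_fsimage is a hypothetical helper.
dump_fsimage() {
  local workdir="$1"   # local directory that will hold the downloaded image
  local out="$2"       # output file for the XML dump
  mkdir -p "$workdir" || return 1
  # Download the most recent fsimage from the NameNode into $workdir.
  hdfs dfsadmin -fetchImage "$workdir" || return 1
  # Pick the newest fsimage_* file, skipping any .md5 checksum files.
  local image
  image=$(ls -t "$workdir"/fsimage_* 2>/dev/null | grep -v '\.md5$' | head -n 1)
  if [ -z "$image" ]; then
    echo "no fsimage found in $workdir" >&2
    return 1
  fi
  # Run the Offline Image Viewer with the XML processor.
  hdfs oiv -p XML -i "$image" -o "$out"
}

# Usage (paths are placeholders):
# dump_fsimage /tmp/fsimage-dump /tmp/fsimage.xml
```

The resulting XML lists inodes (files and directories) together with their block IDs, so it can be searched with ordinary text tools.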

  2. How can I see how my input file is split into blocks and distributed?

    The easiest way is to point your web browser at the HDFS web UI, i.e. namenode_machine:50070 (the default NameNode HTTP port in Hadoop 1.x and 2.x; Hadoop 3 uses 9870). Then browse to the file in question and click to open it. Scroll down and you can see the location of each block of this file.

    Alternatively, you can use getFileBlockLocations(FileStatus file, long start, long len), provided by the FileSystem API. It returns an array of BlockLocation objects containing the hostnames, offset, and size of each portion of the given file.
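If you would rather stay on the command line than write Java, similar per-block host information can be printed with `hdfs fsck` and its `-files -blocks -locations` options. This is only a sketch: the wrapper name `show_block_locations` and the example path are hypothetical, and `hdfs` is assumed to be on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: list every block of a file and the DataNodes holding its replicas.
# show_block_locations is a hypothetical wrapper around `hdfs fsck`.
show_block_locations() {
  local path="$1"   # HDFS path of the file to inspect
  # -files     print the files being checked
  # -blocks    print the block report for each file
  # -locations print the DataNode addresses for each block
  hdfs fsck "$path" -files -blocks -locations
}

# Usage (path is a placeholder):
# show_block_locations /user/alice/input.txt
```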

  3. How can I make sure that my input file is split into blocks and distributed in HDFS?

    You can use fsck to do that. It reports everything you need for a particular file, such as total blocks, minimally replicated blocks, under-replicated blocks, and so on.
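As a rough illustration of such a check, the sketch below prints a file's block size and replication factor with `hdfs dfs -stat`, then runs `hdfs fsck` and succeeds only if the report says the path is HEALTHY. The function name `verify_hdfs_file` and the example path are hypothetical, and the exact wording of the fsck report can vary between Hadoop versions, so treat the string match as an assumption.

```shell
#!/usr/bin/env bash
# Sketch: basic sanity check for a single file in HDFS.
# verify_hdfs_file is a hypothetical helper; assumes `hdfs` is on PATH.
verify_hdfs_file() {
  local path="$1"
  # %o = block size, %r = replication factor, %b = file size in bytes
  hdfs dfs -stat "block_size=%o replication=%r bytes=%b" "$path" || return 1
  local report
  report=$(hdfs fsck "$path" 2>/dev/null)
  # Show the summary lines mentioned above, if present in this version's output.
  printf '%s\n' "$report" \
    | grep -E 'Status|Total blocks|Minimally replicated|Under-replicated' || true
  # Succeed only if fsck declares the path healthy.
  case "$report" in
    *"is HEALTHY"*) return 0 ;;
    *)              return 1 ;;
  esac
}

# Usage (path is a placeholder):
# verify_hdfs_file /user/alice/input.txt && echo "file looks fine"
```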

The NameNode's metadata is stored in a file called "fsimage". You can go through the link below for reference:

Content of the fsimage hdfs
