
Distribution of Data in HDFS Hadoop

I configured 3 datanodes on my Linux machine. In my configuration, I set the replication factor to 1.

I submitted a file to HDFS and found that the file has 3 copies, one on each datanode (I checked it from the browser).

Isn't it right that I should see the file on only 1 datanode, as a single replica?

Before going into HDFS, the file will be split into blocks, and you should see one replica of each block on each datanode. The file as a whole won't be present on any single datanode.
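A quick way to verify this (a sketch; the path /user/hadoop/sample.txt is a made-up example) is to list the blocks of the file and the datanodes holding each replica, and to print the replication factor recorded for the file:

    # Show how the file was split into blocks and where each replica is stored
    hdfs fsck /user/hadoop/sample.txt -files -blocks -locations

    # Print the replication factor recorded for this file (%r = replication)
    hdfs dfs -stat "%r" /user/hadoop/sample.txt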

Please make sure that you have restarted the HDFS daemons after changing the replication factor property in the hdfs-site.xml file.
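A minimal sketch of the restart, assuming a standard installation where the sbin scripts are available. Note that dfs.replication only applies to files written after the change, so a file already in HDFS keeps its recorded replication factor unless you lower it explicitly:

    # Restart all HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
    $HADOOP_HOME/sbin/stop-dfs.sh
    $HADOOP_HOME/sbin/start-dfs.sh

    # Reduce the replication factor of an existing file (path is a made-up example)
    hdfs dfs -setrep -w 1 /user/hadoop/sample.txt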

Also, it would be good if you could post a snapshot of your HDFS console.

I suspect that dfs.replication is set to 3 instead of 1.
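One quick way to see which value your configuration actually resolves to (a sketch; hdfs getconf reads the configuration visible to the client):

    # Print the effective value of dfs.replication (the Hadoop default is 3)
    hdfs getconf -confKey dfs.replication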

Make sure that the parameters below are set to 1 in your hdfs-site.xml (a sample snippet follows the list):

dfs.replication : Default block replication. The actual number of replicas can be specified when the file is created. The default is used if replication is not specified at create time.

dfs.namenode.replication.min : Minimal block replication.
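A minimal hdfs-site.xml sketch with both properties set to 1 (the values shown are assumptions for a single-replica setup; merge the properties into your existing configuration rather than replacing it):

    <configuration>
      <!-- Default replication factor for newly created files -->
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <!-- Minimum number of replicas the NameNode requires per block -->
      <property>
        <name>dfs.namenode.replication.min</name>
        <value>1</value>
      </property>
    </configuration>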

Have a look at the documentation for more details.
