Distribution of Data in HDFS (Hadoop)
I configured 3 datanodes on my Linux machine. In my configuration, I set the replication factor to 1.
I am submitting a file to HDFS, and I found that the file has a copy on each of the 3 datanodes (I checked this from the browser).
Isn't it right that I should only see the file on 1 datanode, as 1 replica?
Before going into HDFS, the file is split into blocks, and you should see one replica of each block on each datanode. The file as a whole won't be present on any single datanode.
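As a rough sketch of the splitting step: HDFS stores a file as fixed-size blocks, so the number of blocks is just the file size divided by the block size, rounded up. The 128 MB block size below is the default in recent Hadoop releases (older releases used 64 MB); your cluster's `dfs.blocksize` may differ.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed default HDFS block size (128 MB)

def num_blocks(file_size_bytes: int) -> int:
    """Number of HDFS blocks a file of the given size is split into."""
    if file_size_bytes == 0:
        return 0
    return math.ceil(file_size_bytes / BLOCK_SIZE)

# A 300 MB file spans 3 blocks: 128 MB + 128 MB + 44 MB.
print(num_blocks(300 * 1024 * 1024))  # 3
```

With a replication factor of 1, each of those blocks is stored on exactly one datanode, so the total storage used equals the file size; with a factor of 3, every block appears on three datanodes.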
Please make sure that you restarted the HDFS daemons after changing the replication factor property in the hdfs-site.xml file.
Also, it would help if you could post a snapshot of your HDFS console.
I suspect that dfs.replication is set to 3 instead of 1.
Make sure that the parameters below are set to 1 in your hdfs-site.xml:

dfs.replication: Default block replication. The actual number of replicas can be specified when the file is created. The default is used if replication is not specified at create time.

dfs.namenode.replication.min: Minimal block replication.
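For reference, the corresponding entries in hdfs-site.xml might look like the fragment below (the property names come from the standard HDFS configuration; the surrounding file layout is the usual Hadoop configuration format):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.replication.min</name>
    <value>1</value>
  </property>
</configuration>
```

Remember that this only affects files created after the change; blocks written while the factor was 3 keep their three replicas until you lower them explicitly.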
Have a look at the documentation for more details.