
How many NameNodes can there be in a single Hadoop cluster?

A Hadoop cluster is a collection of racks. Does each rack contain its own NameNode, or is there only one NameNode for the entire cluster?

In a typical Hadoop deployment, you would not have one NameNode per rack. Many smaller-scale deployments use one NameNode, with an optional Standby NameNode for automatic failover.

However, you can have more than one NameNode. Version 0.23 of Hadoop introduced federated NameNodes to allow for horizontal scaling. But, like I said, in many of the common use cases, you would have one NameNode per cluster (with optional Standby NameNode or Secondary NameNode).

See here for some more info.

It depends on how the racks and the NameNode are configured. You can have one NameNode for the entire cluster. If you are serious about performance, you can configure another NameNode for another set of racks, but one NameNode per rack is not advisable. In Hadoop 1.x you can have only one NameNode (a single namespace), but Hadoop 2.x supports namespace federation, where you can have multiple NameNodes, each usually serving only its own part of the metadata.
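As a rough illustration of the federation setup described above, the sketch below builds the hdfs-site.xml properties for a cluster with two independent NameNodes. The property names dfs.nameservices and dfs.namenode.rpc-address.<nameservice> are the ones HDFS federation uses; the nameservice IDs (ns1, ns2), the hostnames, and the helper name federation_properties are made up for this example, and port 8020 is assumed as the NameNode RPC port.

```python
def federation_properties(namenodes, port=8020):
    """Build hdfs-site.xml properties for a federated cluster.

    namenodes maps a nameservice ID to the host running its NameNode.
    Each nameservice owns its own slice of the namespace.
    """
    # One comma-separated list naming every nameservice in the cluster.
    props = {"dfs.nameservices": ",".join(namenodes)}
    # One RPC address per nameservice, i.e. one NameNode per namespace.
    for ns_id, host in namenodes.items():
        props[f"dfs.namenode.rpc-address.{ns_id}"] = f"{host}:{port}"
    return props


# Hypothetical nameservice IDs and hostnames, for illustration only.
fed_props = federation_properties(
    {"ns1": "nn1.example.com", "ns2": "nn2.example.com"}
)
for name, value in fed_props.items():
    print(f"{name} = {value}")
```

Each printed pair would go into hdfs-site.xml as one property entry. The DataNodes are shared: every DataNode registers with all of the listed NameNodes.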

One. You can have only a single active NameNode in a cluster.

Detail: Hadoop 2.0 (the release that also brought YARN) introduced the concept of an active NameNode and a Standby NameNode. (This is where most people get confused: they count them as two NameNodes serving the cluster.) Even in this architecture, only one NameNode is active at a time, and it is the one that acts on the heartbeats and block reports from the DataNodes. The Standby NameNode keeps its copy of the metadata current by reading the edit log that the active NameNode writes to the JournalNodes, so that it can take over if the active NameNode fails.
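The active/standby pairing above can be sketched as configuration. The example below builds the core hdfs-site.xml properties for one nameservice backed by two NameNodes and a set of JournalNodes. The property names (dfs.ha.namenodes.<nameservice>, dfs.namenode.rpc-address.<nameservice>.<namenode>, dfs.namenode.shared.edits.dir with a qjournal:// URI) come from HDFS high availability with the Quorum Journal Manager; the nameservice ID mycluster, the NameNode IDs, the hostnames, and the helper name ha_properties are assumptions for illustration, as are ports 8020 and 8485. A real setup needs further properties (fencing, client failover provider) that are left out here.

```python
def ha_properties(nameservice, namenodes, journalnodes, port=8020):
    """Build the core hdfs-site.xml properties for an HA NameNode pair.

    namenodes maps a NameNode ID to its host; journalnodes is a list of
    "host:port" strings for the JournalNode quorum.
    """
    props = {
        "dfs.nameservices": nameservice,
        # Both NameNodes belong to the same logical nameservice.
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        # The active NameNode writes edits here; the standby reads them.
        "dfs.namenode.shared.edits.dir":
            "qjournal://" + ";".join(journalnodes) + "/" + nameservice,
    }
    for nn_id, host in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = f"{host}:{port}"
    return props


# Hypothetical IDs and hostnames, for illustration only.
ha_props = ha_properties(
    "mycluster",
    {"nn1": "master1.example.com", "nn2": "master2.example.com"},
    ["jn1.example.com:8485", "jn2.example.com:8485", "jn3.example.com:8485"],
)
for name, value in ha_props.items():
    print(f"{name} = {value}")
```

Note that both NameNodes sit under a single nameservice: clients address mycluster, not an individual host, which matches the point that only one of the two is active at any moment.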

Now suppose you have a large cluster, say 2000 nodes. Even then you can have only one active NameNode, or you can take another approach and divide the cluster into sub-clusters. Each sub-cluster still has one active NameNode, but this will increase processing speed because the ratio of NameNodes to DataNodes is now better.

Conclusion: in any case, you have one active NameNode per cluster (or sub-cluster).
