
HBase read high load

Original post: 2011-11-28 12:10:16. Tags: hadoop, nosql, hbase, hdfs, high-availability

I'm researching NoSQL solutions for our company's needs. For now the search has narrowed to HBase. I've read a lot about its architecture, performance, and so on, but one thing is still unclear to me.

For example, suppose you have a 100-node cluster and one row gets 100,000 simultaneous requests. In this case, will all 100,000 requests hit only the one node where the row is stored? As I understand it, HBase replication is only for data backup (not for read load balancing), and there is no master/slave mechanism (as in MySQL)?

3 answers

Regarding 100,000 concurrent requests for a single row: I don't think any system handles this well currently. Under normal conditions it is simply not needed: clients are isolated from the DB anyway, so access is limited in this case (and probably cached).
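To illustrate the "probably cached" point, here is a minimal sketch of a read-through cache that sits between clients and the store. Everything in it is an assumption made for illustration: `TTLReadThroughCache`, `fake_fetch`, and the TTL value are hypothetical names and numbers, and `fake_fetch` stands in for whatever real read (for example an HBase Get) the application would do.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class TTLReadThroughCache:
    """Tiny read-through cache: serves a row from memory for ttl_seconds."""

    def __init__(self, fetch_row: Callable[[str], dict], ttl_seconds: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_row = fetch_row  # hypothetical backend read
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, row_key: str) -> dict:
        with self._lock:
            now = self._clock()
            entry = self._entries.get(row_key)
            if entry is not None and now - entry[0] < self._ttl:
                return entry[1]  # fresh enough: no backend call
            # Miss or expired: read the backend once while holding the lock,
            # so concurrent callers of this cache do not stampede the store.
            value = self._fetch_row(row_key)
            self._entries[row_key] = (now, value)
            return value


backend_calls = []


def fake_fetch(row_key: str) -> dict:
    """Stand-in for the real store; records every call it receives."""
    backend_calls.append(row_key)
    return {"row": row_key, "value": 42}


cache = TTLReadThroughCache(fake_fetch, ttl_seconds=60.0)
results = [cache.get("hot-row") for _ in range(100_000)]
print(len(backend_calls))  # 1
```

The single global lock keeps the sketch short but serializes reads for all keys; a real deployment would more likely use per-key locking or an existing cache layer. The trade-off is that readers may see data up to one TTL old.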

Regarding storage and replication: first, there are at least 2 types of replication, and the replication itself is actually not done by HBase. HBase relies on HDFS, which is fault tolerant by nature. Read about the roles of the HBase master and the HBase region server if you need to understand the details, but in general everything related to replication is handled by HDFS.
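As a rough picture of the region server role: HBase splits a table into regions, each covering a contiguous range of row keys, and a region is served by one region server at a time. The toy model below shows why every read of the same row key lands on the same server. The region boundaries and the server names (`rs-01` and so on) are made up, and Python string comparison stands in for HBase's byte-wise key ordering, which matches only for ASCII keys.

```python
import bisect
from typing import List, Tuple

# Hypothetical layout: (region start key, region server), sorted by start key.
# An empty start key marks the first region of the table.
REGIONS: List[Tuple[str, str]] = [
    ("", "rs-01"),
    ("g", "rs-02"),
    ("n", "rs-03"),
    ("t", "rs-04"),
]
START_KEYS = [start for start, _ in REGIONS]


def server_for(row_key: str) -> str:
    """Return the server of the region whose key range contains row_key."""
    # The owning region is the last one whose start key is <= row_key.
    index = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[index][1]


# 100,000 lookups of the same row all resolve to a single server.
servers = {server_for("user-123") for _ in range(100_000)}
print(servers)  # {'rs-04'}
```

Adding nodes to the cluster does not change this for a single row, which is why a hot row usually has to be handled in front of the store rather than inside it.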

HBase replication is not only for data backup but also for availability. Since that does not seem to be the only point your question covers, I pointed you to that link, where you can find more information. If you have specific questions about your schema design, you should start at the home page of the Apache-hosted project. As for the last question, about master/slave, the URL I sent still applies (and you can ask the HBase developers about it if you are unsure anyway): http://hbase.apache.org/replication.html