HBase read high load

I'm researching NoSQL solutions for our company's needs. For now the search has narrowed to HBase. I've read a lot about its architecture, performance, etc., but one thing is still unclear to me.

For example, suppose you have a 100-node cluster and one row gets 100,000 simultaneous requests. In that case, will all 100,000 requests hit only the one node where the row is stored? As I understand it, HBase replication is only for data backup (not for read load balancing), and there is no master/slave mechanism (like in MySQL)?
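To make the concern concrete, here is a minimal sketch of range-based row routing. It is a simplified model, not the HBase client API: the region start keys, the server names (`rs1` to `rs4`), and the helper `locate_server` are all hypothetical, and it assumes each row key falls in exactly one region served by exactly one server.

```python
import bisect
from collections import Counter

# Hypothetical layout: sorted region start keys, one serving node per region.
# "" is the start key of the first region.
REGION_START_KEYS = ["", "g", "n", "t"]
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs4"]


def locate_server(row_key: str) -> str:
    """Return the server of the region whose key range contains row_key."""
    # The owning region is the last one whose start key is <= row_key.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_SERVERS[idx]


def simulate(row_keys):
    """Count how many requests each server receives."""
    return Counter(locate_server(key) for key in row_keys)


# 100,000 requests for the same row all resolve to the same server.
hot_row_load = simulate(["user42"] * 100_000)

# Requests spread over different rows are spread over different servers.
mixed_load = simulate(["apple", "house", "pear", "user42"] * 25_000)
```

Under these assumptions, adding more nodes does nothing for a single hot row, because routing depends only on the row key.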

Regarding 100,000 concurrent requests for a single row: I don't think any system handles this well currently. Under normal conditions it is simply not needed. Clients are usually isolated from the DB anyway, so access is limited in this case (and probably cached).
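As an illustration of the caching point, here is a minimal sketch of a read-through cache sitting between clients and the database. Everything in it is an assumption for illustration: the class `ReadThroughCache`, the `loader` callback that stands in for a database read, and the TTL value. It is single-threaded and has no eviction, so treat it as a sketch rather than something production-ready.

```python
import time


class ReadThroughCache:
    """Serve repeated reads of a key from memory for ttl_seconds."""

    def __init__(self, loader, ttl_seconds=1.0, clock=time.monotonic):
        self._loader = loader      # hypothetical function that reads from the DB
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (value, time the value was loaded)
        self.backend_reads = 0     # how many reads actually reached the DB

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]        # still fresh: no DB access
        value = self._loader(key)  # missing or expired: go to the DB once
        self.backend_reads += 1
        self._entries[key] = (value, now)
        return value


# Stand-in for a DB read; a real loader would query the store.
cache = ReadThroughCache(lambda key: key.upper(), ttl_seconds=60.0)
results = [cache.get("hot-row") for _ in range(1000)]
```

With a 60 second TTL, the 1000 reads of the same key above reach the stand-in loader only once. The trade-off is that readers may see a value up to one TTL old.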

Regarding storage and replication: first, there are at least 2 types of replication, and it is actually not done by HBase itself. HBase relies on HDFS, which is fault tolerant by nature. Read about the roles of the HBase master and the HBase region server if you need to understand the details, but in general everything related to replication goes to HDFS.

HBase replication is not only for data backup but also for availability. Since that does not seem to be the only point your question covers, I pointed you to the link below, where you can find more information. If you have specific questions about your schema design, start with the home page of the Apache-hosted project. As for the last question, about master/slave, the same URL still applies (and you can ask the HBase developers about it if you are unsure): http://hbase.apache.org/replication.html

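I guessed that 100,000 concurrent requests would not work well on HBase, but real-world usage seems to be fine: yfrog gets 10K requests per second, eBay chose it for the new version of its product search engine, and Facebook chose it for its messaging system. You can also look at the hstack benchmarks on a more modest cluster.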
