HBase Shell Almost 100x Faster Than Restful Endpoint For Prefix Filter

When I run a scan with a prefix filter in the HBase shell, I get a response in less than 1 second no matter what prefix I use (0 vs. 9 or "a" vs. "z" makes no difference to the response time).

However, when I make the same query from the Microsoft HBase library (in C#), it can take up to 90 seconds to get an answer. Interestingly, the closer the prefix is to 0, the faster the response; the further it is from 0, the longer it takes ("a" is quicker than "f" as a prefix filter).

I'm not sure how to determine what the shell is doing differently from the RESTful query, or how to make the RESTful query more performant.

Some details:

  • There are a little over 20,000,000 records in this table
  • The row key is designed as [guid]_[inverse timestamp], e.g. a6fc9620-5ff0-41c0-9ed9-660bc3fbb65c_9223370501253811889

Any thoughts on what I should look for, or what I should try, to improve the REST API request?

It turns out this is a non-issue. I wasn't running the same commands in the shell and through the REST API, as I had thought.

Through the REST API, I was supplying two filters: a page filter and a prefix filter.

In the HBase shell, I was running:

scan 'beacon', {STARTROW => 'ff', FILTER => "PageFilter(25)"}

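STARTROW isn't the same as a prefix filter. It sets the row key at which the scan begins, so the scan seeks directly to that key instead of traversing the table from its first row, which is what makes it performant. A prefix filter on its own still starts at the beginning of the table and tests each row against the prefix, which explains why prefixes further from 0 took longer through the REST API.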
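The difference can be illustrated with a toy model. This is only a sketch in Python, not HBase code: the `rows` list and both function names are hypothetical, and it ignores regions, caching, and network cost. It counts how many rows each approach has to examine when the keys are kept in sorted order, as HBase stores them.

```python
import bisect

# Hypothetical stand-in for a table: 1024 row keys in sorted order,
# four rows for each two-character hex prefix ("00_0" ... "ff_3").
rows = sorted(f"{i:02x}_{j}" for i in range(256) for j in range(4))

def scan_filter_only(rows, prefix, limit):
    """Start at the first row of the table and test every row against the prefix."""
    examined, matches = 0, []
    for key in rows:
        examined += 1
        if key.startswith(prefix):
            matches.append(key)
            if len(matches) == limit:
                break
        elif key > prefix:
            break  # keys are sorted, so nothing further can match
    return matches, examined

def scan_with_start_row(rows, prefix, limit):
    """Seek to the first key >= prefix, then read forward until the prefix ends."""
    start = bisect.bisect_left(rows, prefix)
    examined, matches = 0, []
    for key in rows[start:]:
        examined += 1
        if not key.startswith(prefix):
            break
        matches.append(key)
        if len(matches) == limit:
            break
    return matches, examined

for prefix in ("0a", "ff"):
    _, filter_cost = scan_filter_only(rows, prefix, limit=2)
    _, seek_cost = scan_with_start_row(rows, prefix, limit=2)
    print(prefix, "filter only:", filter_cost, "with start row:", seek_cost)
```

In this model the filter-only scan examines 42 rows for the prefix "0a" and 1022 rows for "ff", while the scan with a start row examines 2 rows in both cases. That matches the pattern in the question: the cost of the filter-only scan grows with the distance of the prefix from the start of the table.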

It turns out this is what I should have been doing in the REST API call too. When I set a start row and an end row in addition to the prefix filter and page filter, the request returns quickly and as expected.
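One way to derive the start and end row from a prefix is to use the prefix itself as the start row and the prefix with its last byte incremented as the end row. The helper below is a minimal sketch of that idea in Python; the function name `prefix_to_row_range` is hypothetical and not part of any HBase client library, and it assumes the end row is treated as exclusive and that an empty end row means "no upper bound".

```python
from typing import Tuple

def prefix_to_row_range(prefix: bytes) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering every key that begins with prefix.

    An empty stop_row is used here to mean "scan to the end of the table".
    """
    # A trailing 0xFF byte cannot be incremented, so drop such bytes first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        # Empty prefix, or a prefix made only of 0xFF bytes: no upper bound.
        return prefix, b""
    # Increment the last remaining byte to get the first key past the prefix.
    stop_row = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop_row

start_row, stop_row = prefix_to_row_range(b"ff")
print(start_row, stop_row)  # b'ff' b'fg'
```

For the prefix 'ff' this gives a start row of 'ff' and an end row of 'fg', so the scan is confined to the keys that can match the prefix. The same pair of values could then be supplied as the start and end row of the REST scanner, alongside the prefix and page filters.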

https://community.hortonworks.com/articles/55204/recommended-way-to-do-hbase-prefix-scan-through-hb.html
