Making a Scan work like a Get in HBase

I am writing a simple big data application on top of my existing data in HBase. Sometimes I feel that a Scan could be faster than a single Get, so I want to experiment and convert my Get commands into exactly equivalent Scans.

So, given the row keys below, suppose I want the equivalent of Get(12):

row keys
12
123
21
22

What do I need to set as the start row and stop row of my scan, or is there another Scan parameter I should configure?

If the scan direction is the default (i.e. not reversed), what normally works for me is to set the start row to 12 and the stop row to 12x, where x is a trailing character that you know does not occur in your row key space and that sorts lexicographically after every character that does. For example, I usually use '~' as the trailing symbol, but something else might work better for you. Keep in mind that this range also covers any other key that begins with 12, such as 123 in your example, so on its own it can return more than one row.

Scan also has a .setLimit(int) setting, which can cap the scan at a single row. You can use both together. However, I'm not sure why this would be faster than a Get.
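To see what the start row, stop row and limit do to the example keys, here is a minimal sketch that imitates the range selection in plain Java, with no HBase dependency. It assumes row keys are compared as unsigned bytes and that the stop row is exclusive. The class ScanRangeSketch and its scan helper are hypothetical names used only for illustration, not part of the HBase client. The comment naming the HBase 2.x calls reflects my understanding of that API and should be checked against your client version.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class ScanRangeSketch {

    // Order row keys as unsigned bytes, left to right (assumed to match HBase's ordering).
    static int compareRows(String a, String b) {
        return Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8));
    }

    // Return at most `limit` keys from [startRow, stopRow), in sorted order.
    // Roughly mirrors (assumption, HBase 2.x client):
    //   new Scan().withStartRow(start).withStopRow(stop).setLimit(limit)
    static List<String> scan(List<String> rowKeys, String startRow, String stopRow, int limit) {
        TreeSet<String> sorted = new TreeSet<>(ScanRangeSketch::compareRows);
        sorted.addAll(rowKeys);
        List<String> hits = new ArrayList<>();
        // Start row inclusive, stop row exclusive.
        for (String key : sorted.subSet(startRow, true, stopRow, false)) {
            if (hits.size() == limit) {
                break;
            }
            hits.add(key);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("12", "123", "21", "22");

        // The '~' stop row alone also picks up 123, because '3' sorts before '~'.
        System.out.println(scan(keys, "12", "12~", Integer.MAX_VALUE)); // [12, 123]

        // Adding a limit of 1 leaves only the first key in the range.
        System.out.println(scan(keys, "12", "12~", 1)); // [12]

        // Appending a zero byte gives the tightest exclusive stop row.
        System.out.println(scan(keys, "12", "12\u0000", Integer.MAX_VALUE)); // [12]
    }
}
```

The last call shows an alternative to picking a trailing character by hand: no key can sort between 12 and 12 followed by a zero byte, so that range contains the single row even without a limit.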

If your scans really do seem faster than your gets, it may have something to do with the call queue configuration of your cluster. For example, the cluster may be configured to allocate more handlers to scans than to gets. That is not the default behavior, but someone may have set it up that way, and on a very busy cluster that could explain what you are seeing.
