
How to get specific rows in HBase?

My row keys in HBase look like this:

a1s1
a1s2
a1s3
a2s1
a3s1
a3s2
...

I want to get only these rows:

a1s1
a2s1
a3s1

But when I run this query:

scan 't1', {STARTROW=>'a1s1', ENDROW=>'a4s1'}

It gives me:

a1s1
a1s2
a1s3
a2s1
a3s1

But I don't want to get a1s2 and a1s3. How can I do this?

You should combine STARTROW/ENDROW with a RowFilter that uses a RegexStringComparator. A start/end range alone is not enough: HBase compares row keys lexicographically, byte by byte, not numerically, so the scan returns every key that sorts between the start and end rows, including a1s2 and a1s3. In the HBase shell you can try this:

import org.apache.hadoop.hbase.filter.CompareFilter

import org.apache.hadoop.hbase.filter.RegexStringComparator

scan 't1', {STARTROW => 'a1s1', ENDROW => 'a4s1', FILTER => org.apache.hadoop.hbase.filter.RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new("s1$"))}
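
To see why the range alone is not enough, here is a minimal plain-Ruby sketch that simulates the two steps on the row keys from the question. It does not talk to HBase: `range_scan` and `regex_row_filter` are hypothetical helpers written only for illustration, assuming that the start row is inclusive, the end row is exclusive, and the regex may match anywhere in the key.

```ruby
# Plain-Ruby simulation (no HBase needed); row keys copied from the question.
ROW_KEYS = %w[a1s1 a1s2 a1s3 a2s1 a3s1 a3s2].freeze

# Hypothetical helper: mimics a scan with STARTROW (inclusive) and
# ENDROW (exclusive), using lexicographic string comparison.
def range_scan(keys, start_row, end_row)
  keys.sort.select { |k| k >= start_row && k < end_row }
end

# Hypothetical helper: mimics a row filter with a regex comparator,
# keeping only the keys in which the pattern is found.
def regex_row_filter(keys, pattern)
  keys.select { |k| k.match?(pattern) }
end

in_range = range_scan(ROW_KEYS, 'a1s1', 'a4s1')
filtered = regex_row_filter(in_range, /s1\z/)

puts in_range.inspect  # every key that sorts inside the range
puts filtered.inspect  # only the keys ending in "s1"
```

In this simulation the range step keeps a1s2 and a1s3 because they sort between 'a1s1' and 'a4s1'; only the regex step removes them.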

I assume you want the row keys that start with "a" and end with "s1".

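You can narrow the key range with an upper bound only:

 scan 't1', {ENDROW=>'s1'}

Or with both bounds:

scan 't1', {STARTROW=>'a', ENDROW=>'s1'}

Note that both are plain range scans: they return every key that sorts inside the range, so a1s2 and a1s3 are still included.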

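Another option is a RowFilter with a regex comparator, which matches on the "s1" suffix (a pattern that begins with a bare "*" is not a valid Java regular expression, so it is anchored with "$" here):

scan 't1', {FILTER => "RowFilter(=, 'regexstring:s1$')"}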
