My row keys in HBase look like this:
a1s1
a1s2
a1s3
a2s1
a3s1
a3s2
...
I want to get only these rows:
a1s1
a2s1
a3s1
But when I run this query:
scan 't1', {STARTROW=>'a1s1', ENDROW=>'a4s1'}
it gives me:
a1s1
a1s2
a1s3
a2s1
a3s1
But I don't want to get a1s2 and a1s3. How can I do this?
You should combine STARTROW/ENDROW with a RowFilter that uses RegexStringComparator. A start/end row range alone is not enough: HBase compares row keys lexicographically, byte by byte, rather than as numbers, so every key that sorts between 'a1s1' and 'a4s1' is returned, including a1s2 and a1s3. In the HBase shell you can try this:
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.RegexStringComparator
scan 't1', {STARTROW => 'a1s1', ENDROW => 'a4s1', FILTER => org.apache.hadoop.hbase.filter.RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new("s1$"))}
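To see why the range alone is not enough, here is a minimal plain-Ruby sketch that imitates the two steps on an in-memory list of keys. It does not talk to HBase; `range_scan` and `row_regex_filter` are hypothetical helpers made up for illustration, and the sketch assumes the start row is inclusive and the end row is exclusive.

```ruby
# Sample row keys, in the sorted order HBase would store them.
ROW_KEYS = %w[a1s1 a1s2 a1s3 a2s1 a3s1 a3s2].freeze

# Hypothetical helper: imitate a STARTROW/ENDROW scan.
# Keys are compared as strings (lexicographically), not as numbers.
def range_scan(keys, start_row, end_row)
  keys.select { |k| k >= start_row && k < end_row }
end

# Hypothetical helper: imitate RowFilter(EQUAL, RegexStringComparator).
# A key is kept when the pattern is found somewhere in it.
def row_regex_filter(keys, pattern)
  keys.select { |k| k.match?(pattern) }
end

in_range = range_scan(ROW_KEYS, 'a1s1', 'a4s1')
wanted   = row_regex_filter(in_range, /s1$/)

puts in_range.inspect  # every key between the bounds, including a1s2 and a1s3
puts wanted.inspect    # only the keys ending in "s1"
```

The range narrows how much data is read, and the regex then discards the keys inside the range that do not end in "s1".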
I assume you want the row keys that start with "a" and end with "s1".
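A start/stop row range can bound the scan, but it cannot select on the "s1" suffix by itself. ENDROW is an exclusive upper bound on the key, not a suffix, so this simply returns every key that sorts before 's1':
scan 't1', {ENDROW => 's1'}
and the same holds when a start row is added:
scan 't1', {STARTROW => 'a', ENDROW => 's1'}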
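Another option is a RowFilter with a regexstring comparator. The pattern is a regular expression rather than a glob, so anchor it as s1$ instead of writing *s1:
scan 't1', {FILTER => "RowFilter(=, 'regexstring:s1$')"}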
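The way a suffix pattern is written matters. The plain-Ruby sketch below compares three candidate patterns on a few sample keys. It uses Ruby's regex engine, not the Java one HBase uses, and assumes "find anywhere in the key" matching, so treat it as an illustration of the idea rather than of HBase's exact behavior. `matching_keys` and the extra key a3s11 are made up for the example.

```ruby
KEYS = %w[a1s1 a1s2 a2s1 a3s11].freeze

# Hypothetical helper: return the keys in which the pattern is found,
# or nil when the pattern is not a valid regular expression.
def matching_keys(keys, pattern_source)
  regex = Regexp.new(pattern_source)
  keys.select { |k| k.match?(regex) }
rescue RegexpError
  nil
end

anchored   = matching_keys(KEYS, 's1$')   # "s1" must be the end of the key
unanchored = matching_keys(KEYS, '.*s1')  # "s1" may appear anywhere
glob_style = matching_keys(KEYS, '*s1')   # glob syntax, not a valid regex

puts anchored.inspect
puts unanchored.inspect
puts glob_style.inspect
```

Only the anchored pattern excludes a key such as a3s11, which contains "s1" without ending in it.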