How to get specific rows in HBase?

Question

My row keys in HBase look like this:

a1s1
a1s2
a1s3
a2s1
a3s1
a3s2
...

I want to get only these rows:

a1s1
a2s1
a3s1

But when I run this query: scan 't1', {STARTROW=>'a1s1', ENDROW=>'a4s1'}

It gives me:

a1s1
a1s2
a1s3
a2s1
a3s1

But I don't want to get a1s2 and a1s3. How can I do this?

Answer 1

You should combine STARTROW/ENDROW with a RowFilter that uses a RegexStringComparator. A start/stop-row scan alone is not enough: row keys are byte strings, not numbers, so HBase compares them lexicographically, byte by byte, and returns every key that sorts between the two bounds. In the HBase shell you can try this:

import org.apache.hadoop.hbase.filter.CompareFilter

import org.apache.hadoop.hbase.filter.RegexStringComparator

scan 't1', {STARTROW => 'a1s1', ENDROW => 'a4s1', FILTER => org.apache.hadoop.hbase.filter.RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new("s1$"))}
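To see why the range alone is not enough, here is a minimal Python sketch that mimics the scan on an in-memory sorted list. It is an illustration only, not HBase client code: `scan` is a hypothetical helper, the key list is taken from the question, and `re.search` stands in for the find-anywhere matching that a regex row filter is assumed to perform.

```python
import re
from bisect import bisect_left

# Row keys from the question, kept sorted the way HBase stores them.
ROW_KEYS = sorted(["a1s1", "a1s2", "a1s3", "a2s1", "a3s1", "a3s2"])

def scan(keys, start_row, stop_row, row_regex=None):
    """Hypothetical helper: return keys in [start_row, stop_row), optionally
    keeping only those where row_regex is found somewhere in the key."""
    lo = bisect_left(keys, start_row)   # start row is inclusive
    hi = bisect_left(keys, stop_row)    # stop row is exclusive
    selected = keys[lo:hi]
    if row_regex is not None:
        pattern = re.compile(row_regex)
        selected = [k for k in selected if pattern.search(k)]
    return selected

# Range only: every key that sorts between the bounds comes back.
range_only = scan(ROW_KEYS, "a1s1", "a4s1")
# Range plus suffix filter: only keys ending in "s1" survive.
with_filter = scan(ROW_KEYS, "a1s1", "a4s1", row_regex=r"s1$")
print(range_only)
print(with_filter)
```

The range narrows which part of the table is read, and the regex then discards the keys inside that range that do not end in "s1".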

Answer 2

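I assume you want the row keys that start with "a" and end with "s1".

You can bound the scan with either of the range scans below, but note that a range only restricts where the key sorts; it does not check the suffix, so rows such as a1s2 and a1s3 are still returned: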

 scan 't1', { ENDROW=>'s1'}

Or

scan 't1', {STARTROW=>'a', ENDROW=>'s1'}

Another option, and the one that actually checks the suffix, is a RowFilter with a regexstring comparator:

scan 't1', {FILTER => "RowFilter(=, 'regexstring:.*s1$')"}
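A note on the pattern itself: if the regex is matched anywhere in the key, the trailing `$` is what restricts the result to keys that end with "s1". The Python sketch below uses `re.search` as a stand-in for that behaviour; `matching` is a hypothetical helper, and the key `a1s10` is a made-up example that does not appear in the question's table.

```python
import re

# Keys from the question plus one hypothetical key, "a1s10", added to show the edge case.
keys = ["a1s1", "a1s10", "a1s2", "a2s1", "a3s1"]

def matching(pattern, row_keys):
    """Keep the keys where the pattern is found anywhere in the key."""
    rx = re.compile(pattern)
    return [k for k in row_keys if rx.search(k)]

unanchored = matching(r"s1", keys)   # "s1" may appear anywhere, so "a1s10" is kept
anchored = matching(r"s1$", keys)    # "$" requires the key to end with "s1"
print(unanchored)
print(anchored)
```

If keys like the hypothetical `a1s10` can exist in the table, the anchored form is the safer choice.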

2 answers

solution1
1 ACCEPTED 2019-03-14 08:10:17

solution2
0 2019-02-19 13:35:39
