
Row Range Filter vs Substring comparator - Hbase

My HBase row keys are laid out like this: timestamp-userid

I need to scan all the rows in HBase and return every row with userid = 38356644322545651

So we have

vid = "38356644322545651";

At the moment I'm using a little hack, a substring comparator:

Scan s = new Scan();
Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator(vid));
s.setFilter(f);

This works perfectly!

However, I doubt that checking every row key for a substring is efficient. Also, if other row keys ever contained this userid as part of a longer value, they would match too, which could cause problems.
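
The false-match concern can be reproduced without a cluster. The sketch below is plain Java, not HBase code: `String.contains` stands in for the substring check, and the sample row keys and the `hasExactUserId` helper are made up for illustration.

```java
final class SubstringMatchDemo {
    // Stand-in for a substring check: true if the id occurs anywhere in the key.
    static boolean containsId(String rowKey, String id) {
        return rowKey.contains(id);
    }

    // Stricter check (hypothetical helper): everything after the first '-'
    // must equal the id exactly.
    static boolean hasExactUserId(String rowKey, String id) {
        int sep = rowKey.indexOf('-');
        return sep >= 0 && rowKey.substring(sep + 1).equals(id);
    }

    public static void main(String[] args) {
        String vid = "38356644322545651";
        String[] keys = {
            "1500000000000-38356644322545651",   // the wanted user
            "1500000000001-383566443225456519",  // a longer, different userid
            "1500000000002-11111111111111111"    // an unrelated user
        };
        for (String key : keys) {
            System.out.println(key
                    + " substring=" + containsId(key, vid)
                    + " exact=" + hasExactUserId(key, vid));
        }
    }
}
```

The second key passes the substring check even though it belongs to a different user, which is exactly the situation described above.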

So I found something called a MultiRowRangeFilter.

It seems pretty straight-forward. My implementation is as follows:

Scan s = new Scan();
List<MultiRowRangeFilter.RowRange> lst = new ArrayList<MultiRowRangeFilter.RowRange>();
lst.add(new MultiRowRangeFilter.RowRange("0-" + vid, true, "z-" + vid, true));
s.setFilter(new MultiRowRangeFilter(lst));

This doesn't seem to work at all. Any ideas?

Simply put, MultiRowRangeFilter is not suited to your scenario.
If you are worried about efficiency and correctness, I recommend RegexStringComparator:

    // Number of digits in a millisecond timestamp (13 at present).
    int len = String.valueOf(System.currentTimeMillis()).length();
    // seperator is the row key delimiter ('-' in the question's layout).
    String expr = "^[0-9]{" + len + "}" + String.valueOf(seperator) + vid + "$";

    // The pattern does not depend on any flag, so 0 would work just as well.
    int flag = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    RegexStringComparator.EngineType engineType = RegexStringComparator.EngineType.JAVA;

    RowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL,
            new RegexStringComparator(expr, flag, engineType));
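
To check what such an expression accepts before running it against a table, the pattern can be tried with `java.util.regex` alone. This is only a sketch: it assumes a '-' separator and 13-digit timestamps, the class and method names are invented, and it quotes the separator and userid with `Pattern.quote` as a precaution that the snippet above does not take.

```java
import java.util.regex.Pattern;

final class RowKeyRegexDemo {
    // Builds "^[0-9]{n}<separator><userId>$" with the literal parts quoted.
    static Pattern rowKeyPattern(int timestampDigits, char separator, String userId) {
        String expr = "^[0-9]{" + timestampDigits + "}"
                + Pattern.quote(String.valueOf(separator))
                + Pattern.quote(userId) + "$";
        return Pattern.compile(expr);
    }

    // The anchors make find() behave like a whole-key match here.
    static boolean matches(Pattern pattern, String rowKey) {
        return pattern.matcher(rowKey).find();
    }

    public static void main(String[] args) {
        Pattern p = rowKeyPattern(13, '-', "38356644322545651");
        String[] keys = {
            "1500000000000-38356644322545651",   // wanted
            "1500000000001-383566443225456519",  // longer userid
            "150000000000-38356644322545651"     // only 12 timestamp digits
        };
        for (String key : keys) {
            System.out.println(key + " -> " + matches(p, key));
        }
    }
}
```

Because the expression is anchored at both ends, a longer userid or a timestamp of a different length is rejected, which the substring approach cannot guarantee.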

If you still want to try MultiRowRangeFilter, the start key should be 0000000000000-vid and the end key should be 9999999999999-vid, as in the code below:

    int len = String.valueOf(System.currentTimeMillis()).length();
    // getStrOfRepeatedChar is a helper not shown here; by its name it returns
    // the given character repeated len times.
    String startPrefix = getStrOfRepeatedChar(len, '0'),
            endPrefix = getStrOfRepeatedChar(len, '9');

    // wantedId is the userid to look up (vid in the question).
    String startRow = startPrefix + String.valueOf(seperator) + wantedId,
            endRow = endPrefix + String.valueOf(seperator) + wantedId;
    RowRange rowRange = new RowRange(startRow, true, endRow, true);

    List<RowRange> rowRangeList = new ArrayList<>();
    rowRangeList.add(rowRange);

    Filter multiRowRangeFilter = new MultiRowRangeFilter(rowRangeList);

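But the result is still incorrect: the scan returns practically every row in the table. A row range is a contiguous interval in lexicographic key order, so any key whose timestamp lies between 0000000000000 and 9999999999999 falls inside it, whatever userid follows the separator.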
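
Why the range is too wide can be seen with ordinary string comparison. The sketch below is not HBase code: it assumes ASCII row keys, for which `String.compareTo` orders keys the same way a byte-wise lexicographic comparison would, and the class name and sample keys are made up.

```java
final class RowRangeDemo {
    // True if rowKey lies in [startRow, stopRow] in lexicographic order.
    static boolean inClosedRange(String rowKey, String startRow, String stopRow) {
        return rowKey.compareTo(startRow) >= 0 && rowKey.compareTo(stopRow) <= 0;
    }

    public static void main(String[] args) {
        String vid = "38356644322545651";
        String startRow = "0000000000000-" + vid;
        String stopRow = "9999999999999-" + vid;

        String wanted = "1500000000000-" + vid;
        String otherUser = "1500000000002-11111111111111111";

        // Both keys are inside the range: the comparison is decided by the
        // timestamp digits long before the userid is reached.
        System.out.println("wanted in range:     " + inClosedRange(wanted, startRow, stopRow));
        System.out.println("other user in range: " + inClosedRange(otherUser, startRow, stopRow));
    }
}
```

A range filter can only narrow a scan on the leading part of the key. With a timestamp-userid layout that leading part is the timestamp, so selecting by userid needs a filter that inspects each key, such as the regex comparator.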
