Querying the suffix part of a row key in an HBase table

Question

I have an HBase table whose row key is a composite key of the form part1_part2_part3. I want to query for a keyword in part3 of the row key. Is there a more efficient way to do this than scanning all rows and checking whether part3 contains the keyword?

Answer 1

Have you tried using HBase filters? If not, you could use RowFilter with SubstringComparator to achieve this. This is how RowFilter is used:

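import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

public class RowFilterDemo {

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "demo_table");
        Scan s = new Scan();
        // Keep only rows whose key contains the substring "_part3"
        Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("_part3"));
        s.setFilter(f);
        ResultScanner rs = table.getScanner(s);
        for (Result r : rs) {
            // Print every cell (KeyValue) of each matching row
            for (KeyValue kv : r.raw()) {
                System.out.println("RowKey : " + Bytes.toString(r.getRow()));
                System.out.println("Qualifier : " + Bytes.toString(kv.getQualifier()));
                System.out.println("Value : " + Bytes.toString(kv.getValue()));
            }
        }
        // Release the scanner and the table once the results are consumed
        rs.close();
        table.close();
    }
}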

This will return all the rows whose row key contains _part3. The filter is evaluated on the region servers, but the scan still has to read every row of the table.
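
A substring test matches the keyword anywhere in the key, not only in part3. The following is a minimal plain-Java sketch (no HBase dependency) of a regular expression that restricts the match to the last component; the class name, method name, and sample keys are hypothetical, and it assumes the parts themselves never contain "_". A similar pattern could presumably be handed to HBase's RegexStringComparator in place of SubstringComparator, though that would still be a full scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: it runs on the client against
// plain strings and does not talk to HBase.
class SuffixMatchDemo {

    // Keeps only the keys whose last "_"-separated component contains the keyword.
    static List<String> matchPart3(List<String> rowKeys, String keyword) {
        // [^_]* cannot cross a separator, so the keyword has to sit in the final component.
        Pattern p = Pattern.compile(".*_[^_]*" + Pattern.quote(keyword) + "[^_]*$");
        List<String> hits = new ArrayList<>();
        for (String key : rowKeys) {
            if (p.matcher(key).matches()) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("us_2014_apple", "apple_2014_us", "eu_2013_pineapple");
        // "apple_2014_us" is rejected because the keyword is in part1, not part3.
        System.out.println(matchPart3(keys, "apple"));
    }
}
```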

Another approach is to tweak your row key design by reversing the order of its parts and using PrefixFilter to fetch the data. Given a prefix, specified when you instantiate the filter, all rows whose keys start with that prefix are returned to the client.

In that case the row key would be part3_part2_part1, and the code to fetch the data would be:

Filter filter = new PrefixFilter(Bytes.toBytes("part3_"));
Scan scan = new Scan();
scan.setFilter(filter);
ResultScanner scanner = table.getScanner(scan);
for (Result result : scanner) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " + Bytes.toString(kv.getValue()));
    }
}
scanner.close();
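
Adopting the reversed layout means every key has to be written as part3_part2_part1 instead of part1_part2_part3. A minimal sketch of that transformation follows; the class and method names are hypothetical, and it assumes "_" appears only as the separator between parts.

```java
// Hypothetical helper, for illustration only: rebuilds a composite key with
// its "_"-separated parts in reverse order.
class RowKeyReverser {

    static String reverseParts(String rowKey) {
        // The -1 limit keeps empty trailing parts, so no component is silently dropped.
        String[] parts = rowKey.split("_", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            sb.append(parts[i]);
            if (i > 0) {
                sb.append('_');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseParts("part1_part2_part3"));
    }
}
```

Applying the method twice gives back the original key, so the same helper can map results back to the old layout.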

This approach also lets you perform range scans by setting a start row and a stop row on the Scan (Scan.setStartRow() and Scan.setStopRow()). This will be much more efficient than using filters, because the scan reads only the rows inside the key range.
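
For a range scan over one prefix, the start row is the prefix itself and the stop row (which is exclusive) is the smallest key that sorts after every key carrying that prefix. A minimal sketch of computing that stop row is below; the class and method names are hypothetical, and the resulting byte arrays would then be set as the start and stop rows of the Scan.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only: derives the exclusive upper
// bound of a prefix range in unsigned lexicographic byte order.
class PrefixRange {

    static byte[] stopRowFor(byte[] prefix) {
        // Walk back past trailing 0xFF bytes, which cannot be incremented.
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++; // next possible value at this position
                return stop;
            }
        }
        // Every byte was 0xFF (or the prefix was empty): no finite upper bound.
        // An empty stop row means "scan to the end of the table" in HBase.
        return new byte[0];
    }

    public static void main(String[] args) {
        byte[] start = "part3_".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowFor(start);
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```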

A more advanced approach would be to use HBase's FuzzyRowFilter. To use it, however, your row keys must all be of the same length.
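
FuzzyRowFilter is configured with a template key and a mask of the same length, where a mask byte of 0 marks a position that must match the template and 1 marks a position that may hold anything. The sketch below is a local emulation of that idea, not HBase's implementation; the class name, method names, and the 4-character parts are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical local emulation, for illustration only: shows why fixed-length
// keys are needed, since part3 must sit at the same byte offsets in every key.
class FuzzyMaskSketch {

    // Mask with 1 ("any byte") everywhere except the trailing suffixLength
    // positions, which are 0 ("must equal the template byte").
    static byte[] suffixMask(int keyLength, int suffixLength) {
        byte[] mask = new byte[keyLength];
        Arrays.fill(mask, 0, keyLength - suffixLength, (byte) 1);
        return mask;
    }

    // Simplified check: a key matches if it agrees with the template on every
    // position whose mask byte is 0.
    static boolean matches(byte[] rowKey, byte[] template, byte[] mask) {
        if (rowKey.length != template.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            if (mask[i] == 0 && rowKey[i] != template[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed layout: three 4-character parts joined by "_" (14 bytes in total).
        byte[] template = "????_????_abcd".getBytes(StandardCharsets.UTF_8);
        byte[] mask = suffixMask(template.length, 4);
        byte[] key = "aaaa_bbbb_abcd".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(key, template, mask));
    }
}
```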

- So is there any optimal way of querying other than scanning all rows and checking existence of keyword in part3?

Change your design if possible and use range queries.

HTH

1 2014-05-13 23:59:34
