
Query on HBase table for postfix part of rowkey

I have an HBase table whose row key is a composite key of the form part1_part2_part3. I want to query for a keyword that appears in part3 of the rowkey. Is there a better way of querying than scanning all rows and checking whether part3 contains the keyword?

Have you tried using HBase filters? If not, you could use RowFilter with SubstringComparator to achieve this. This is how RowFilter is used:

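import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.util.Bytes;

// Written against the older HBase client API (HTable, KeyValue, Result.raw()).
public class RowFilterDemo {

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "demo_table");
        Scan s = new Scan();
        // Keep only rows whose key contains the substring "_part3".
        Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("_part3"));
        s.setFilter(f);
        ResultScanner rs = table.getScanner(s);
        for (Result r : rs) {
            // Print every cell of each matching row.
            for (KeyValue kv : r.raw()) {
                System.out.println("RowKey : " + Bytes.toString(r.getRow()));
                System.out.println("Qualifier : " + Bytes.toString(kv.getQualifier()));
                System.out.println("Value : " + Bytes.toString(kv.getValue()));
            }
        }
        rs.close();
        table.close();
    }
}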

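This will return all the rows whose rowkey contains _part3. Note that the filter is evaluated on the server against every row, so this is still a full table scan; it only reduces what is sent back to the client.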
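One thing to watch: a substring match is not the same as a match on the last part of the key, because the fragment may also occur in part1 or part2. Here is a minimal sketch of the difference in plain Java, with no HBase dependency. The class name, method names and sample keys are hypothetical. On the HBase side, a RegexStringComparator with a pattern anchored at the end could presumably play the role of the suffix check, but treat that as an assumption to verify against your HBase version.

```java
import java.util.regex.Pattern;

class SuffixMatchSketch {

    // True if the fragment occurs anywhere in the key (what a substring match checks).
    static boolean containsFragment(String rowKey, String fragment) {
        return rowKey.contains(fragment);
    }

    // True only if the key ends with "_" + keyword, i.e. the keyword is the whole last part.
    static boolean lastPartEquals(String rowKey, String keyword) {
        Pattern p = Pattern.compile("_" + Pattern.quote(keyword) + "$");
        return p.matcher(rowKey).find();
    }

    public static void main(String[] args) {
        String[] keys = {"a_b_foo", "a_foo_b"};   // hypothetical sample keys
        for (String key : keys) {
            System.out.println(key
                    + " substring=" + containsFragment(key, "_foo")
                    + " suffix=" + lastPartEquals(key, "foo"));
        }
    }
}
```

Both checks still need to look at every key, so this only affects correctness of the match, not the cost of the scan.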

Another approach could be to tweak your rowkey design a bit by reversing the order of its parts and using PrefixFilter to fetch the data. Given a prefix, specified when you instantiate the filter, all rows whose keys start with this prefix are returned to the client.

In that case the rowkey would be part3_part2_part1, and the code to fetch the data would be:

// Match only rows whose key starts with "part3_".
Filter filter = new PrefixFilter(Bytes.toBytes("part3_"));
Scan scan = new Scan();
scan.setFilter(filter);
ResultScanner scanner = table.getScanner(scan);
for (Result result : scanner) {
    for (KeyValue kv : result.raw()) {
        System.out.println("KV: " + kv + ", Value: " + Bytes.toString(kv.getValue()));
    }
}
scanner.close();
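Changing the key layout means every key has to be written in the new order. As a rough illustration, here is a small sketch of a hypothetical helper (reverseParts is not an HBase API) that turns part1_part2_part3 into part3_part2_part1. It assumes the "_" separator never occurs inside a part.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ReversedKeySketch {

    // Reverses the order of the "_"-separated parts of a composite key.
    // Assumption: "_" is used only as the separator, never inside a part.
    static String reverseParts(String rowKey) {
        List<String> parts = Arrays.asList(rowKey.split("_", -1));
        Collections.reverse(parts);
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        System.out.println(reverseParts("part1_part2_part3"));
    }
}
```

Applying the same helper twice gives back the original key, so it can also be used to restore the original order when reading.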

This approach also gives you the ability to perform range scans using the Scan.setStartRow() and Scan.setStopRow() methods (withStartRow() and withStopRow() in newer client versions). A range scan reads only the rows between the start and stop keys, so it will be much more efficient than applying a filter to every row in the table.
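For a prefix such as "part3_", the start row is simply the prefix bytes, and the stop row is the smallest key that sorts after every key with that prefix. Here is a sketch of how that stop row could be computed in plain Java. The helper name stopRowForPrefix is hypothetical, and the two byte arrays it works with are what you would hand to the scan's start and stop row setters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrefixRangeSketch {

    // Returns the smallest key that sorts after every key starting with the prefix:
    // drop trailing 0xFF bytes, then increment the last remaining byte.
    // Returns an empty array if the prefix is empty or all 0xFF (no upper bound exists).
    static byte[] stopRowForPrefix(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++;
                return stop;
            }
        }
        return new byte[0];
    }

    public static void main(String[] args) {
        byte[] startRow = "part3_".getBytes(StandardCharsets.UTF_8);
        byte[] stopRow = stopRowForPrefix(startRow);
        System.out.println("start = " + new String(startRow, StandardCharsets.UTF_8));
        System.out.println("stop  = " + new String(stopRow, StandardCharsets.UTF_8));
    }
}
```

The stop row is exclusive in an HBase scan, which is why the incremented prefix works as an upper bound. An empty stop row is conventionally read as "scan to the end of the table".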

A more advanced approach would be to use HBase FuzzyRowFilter. But in order to use it, your rowkeys must all be of the same length.
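The fixed-length requirement follows from how the filter is configured: you give it a template key plus a mask of the same length, where a mask byte of 0 means "this position must equal the template" and 1 means "any byte is fine". Here is a rough sketch in plain Java of building such a pair for keys whose last part sits at a fixed offset, with a simplified local emulation of the match. The class, the method names and the 9-byte prefix width are all hypothetical, and the matches method is only an illustration, not the filter's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FuzzyMaskSketch {

    // Template key: prefixLen placeholder bytes followed by the fixed suffix bytes.
    static byte[] template(int prefixLen, String suffix) {
        byte[] s = suffix.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[prefixLen + s.length];
        System.arraycopy(s, 0, key, prefixLen, s.length);
        return key;
    }

    // Mask: 1 for the "don't care" prefix positions, 0 for positions that must match.
    static byte[] mask(int prefixLen, int suffixLen) {
        byte[] m = new byte[prefixLen + suffixLen];
        Arrays.fill(m, 0, prefixLen, (byte) 1);
        return m;
    }

    // Simplified local emulation: every position with mask 0 must equal the template.
    static boolean matches(byte[] rowKey, byte[] template, byte[] mask) {
        if (rowKey.length != template.length) {
            return false;
        }
        for (int i = 0; i < rowKey.length; i++) {
            if (mask[i] == 0 && rowKey[i] != template[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical layout: 9 bytes for "part1_part2", then "_" plus a 4-byte part3.
        byte[] t = template(9, "_foo1");
        byte[] m = mask(9, 5);
        byte[] key = "aaaa_bbbb_foo1".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(key, t, m));
    }
}
```

If part1 or part2 could vary in length, part3 would no longer sit at a fixed offset and a single mask could not describe it, which is why this approach only fits fixed-width keys.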

- So is there any optimal way of querying other than scanning all rows and checking existence of keyword in part3?

Change your design if possible and use range queries.

HTH
