
HBase (Easy): How to Perform Range Prefix Scan in hbase shell

I am designing an app to run on HBase and want to interactively explore the contents of my cluster. I am in the hbase shell and I want to perform a scan of all keys starting with the chars "abc". Such keys might include "abc4", "abc92", "abc20014", etc. I tried a scan:

hbase(main):003:0> scan 'mytable', {STARTROW => 'abc', ENDROW => 'abc'}

But this does not seem to return anything, since there is technically no rowkey "abc", only rowkeys starting with "abc".

What I want is something like:

hbase(main):003:0> scan 'mytable', {STARTSROWPREFIX => 'abc', ENDROWPREFIX => 'abc'}

I hear HBase can do this quickly and that it is one of its main selling points. How do I do this in the hbase shell?

So it turns out to be very easy. Scan ranges are half-open: the start row is inclusive and the end row is exclusive, so the logic is start <= key < end. So the answer is:

scan 'mytable', {STARTROW => 'abc', ENDROW => 'abd'}
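To see why 'abd' works as the end row, here is a minimal Python sketch of the start <= key < end rule applied to a sorted list of keys. It is only a toy model of a key range lookup, not HBase code; the function name `range_scan` and the sample keys are made up for illustration.

```python
from bisect import bisect_left

def range_scan(sorted_keys, start, stop):
    """Return the keys with start <= key < stop (stop is exclusive)."""
    lo = bisect_left(sorted_keys, start)  # first key >= start
    hi = bisect_left(sorted_keys, stop)   # first key >= stop, excluded from the slice
    return sorted_keys[lo:hi]

# Hypothetical row keys, sorted bytewise the way row keys are ordered.
keys = sorted([b"abb9", b"abc20014", b"abc4", b"abc92", b"abd1"])

print(range_scan(keys, b"abc", b"abd"))  # every key that starts with "abc"
print(range_scan(keys, b"abc", b"abc"))  # empty: no key satisfies abc <= key < abc
```

Every key beginning with "abc" sorts at or after "abc" and before "abd", so the half-open range covers exactly that prefix.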

In recent versions of HBase you can now do this in the hbase shell:

scan 'mytable', {ROWPREFIXFILTER => 'abc'}

This is effectively equivalent to the following (and it also works for binary row keys):

scan 'mytable', {STARTROW => 'abc', ENDROW => 'abd'}
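For binary keys the end row cannot always be found by eye. A minimal Python sketch of how an exclusive end row can be derived from a prefix is shown below: drop trailing 0xFF bytes, then increment the last remaining byte. The helper name `prefix_stop_row` is hypothetical, and the assumption that this matches what the shell does internally for ROWPREFIXFILTER is mine, not something stated above.

```python
def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key that sorts after every key starting with `prefix`.

    Returns b"" when there is no such key (empty or all-0xFF prefix),
    which is meant to be read as "no upper bound".
    """
    stripped = prefix.rstrip(b"\xff")  # a 0xFF byte cannot be incremented
    if not stripped:
        return b""
    # Increment the last byte that is not 0xFF and drop everything after it.
    return stripped[:-1] + bytes([stripped[-1] + 1])

print(prefix_stop_row(b"abc"))     # b'abd'
print(prefix_stop_row(b"ab\xff"))  # b'ac'
```

With this helper, a prefix scan is just a range scan from `prefix` (inclusive) to `prefix_stop_row(prefix)` (exclusive).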

This method is a LOT more efficient than the "PrefixFilter" approach because the latter puts all records through the comparison code that is present in the PrefixFilter class.

The accepted solution won't work in all cases (binary keys). In addition, using a PrefixFilter on its own can be slow because it performs a table scan until it reaches the prefix. A more performant solution is to use a STARTROW and a FILTER like so:

 scan 'my_table', {STARTROW => 'abc', FILTER => "PrefixFilter('abc')"}
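The difference can be illustrated with a toy Python model that counts how many rows are examined with and without a start row. This is a sketch under assumptions, not HBase's implementation: it assumes the filter checks rows one by one in key order and stops once a row sorts past the prefix, and the name `prefix_filter_scan` and the sample keys are hypothetical.

```python
from bisect import bisect_left

def prefix_filter_scan(sorted_keys, prefix, start=b""):
    """Walk rows from `start`, keep those matching `prefix`.

    Returns (matches, rows_examined). Assumes the scan stops at the
    first row that sorts after the prefix range.
    """
    matches, examined = [], 0
    for key in sorted_keys[bisect_left(sorted_keys, start):]:
        examined += 1
        if key.startswith(prefix):
            matches.append(key)
        elif key > prefix:
            break  # past the prefix range; nothing further can match
    return matches, examined

keys = sorted([b"aaa1", b"aab2", b"abb9", b"abc20014",
               b"abc4", b"abc92", b"abd1", b"zzz"])

print(prefix_filter_scan(keys, b"abc"))          # starts at the first row of the table
print(prefix_filter_scan(keys, b"abc", b"abc"))  # seeks straight to the prefix
```

Both calls return the same rows, but in this model the second one skips every row that sorts before "abc", which is the saving that STARTROW provides.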

I think what you need is a filter.

Check out the answer to the following question: Scan with filter using HBase shell.

More filters are listed at http://hbase.apache.org/book/client.filter.html


 