Using Scan in HBase with start row, end row and a filter

I need to use a Scan in HBase to scan all rows that meet certain criteria, which is why I will use a filter (actually a compound filter list that includes two SingleColumnValueFilter instances). My rowKeys are structured in this way:

a.b.x|1|1252525  
a.b.x|1|2373273  
a.b.x|1|2999238  
...  
a.b.x|2|3000320  
a.b.x|2|4000023  
...  
a.b.y|1|1202002  
a.b.y|1|1778949  
a.b.y|1|2738273  

As an additional requirement, I need to iterate over only those rows whose rowKey starts with "a.b.x|1".

Now, the questions:

  1. If I add a PrefixFilter to my filter list, does the scanner always scan all rows (and apply the filter to each of them)?
  2. If I instantiate the Scan with a startRow (the prefix) and the filter list (without the PrefixFilter), I understand that the scan starts from the given row prefix. So, assuming I use "a.b.x" as the startRow, will the scan also cover the "a.b.y" rows?
  3. What is the behaviour if I use new Scan(startRow, endRow) and then setFilter? In other words: what about the missing constructor Scan(byte[] start, byte[] end, Filter filter)?

Thanks in advance
Andrea

Row keys are sorted lexicographically in HBase, so all the "a.b.x|1" keys come before the "a.b.x|2" keys, and so on. Because row keys are stored as byte arrays and compared byte by byte, be careful with row keys that are not fixed length and with keys that mix different character classes. For your requirement, something along these lines should work:

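import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Create a scan with a start row (inclusive) and a stop row (exclusive)
Scan scan = new Scan(Bytes.toBytes("a.b.x|1"), Bytes.toBytes("a.b.x|2"));

// Attach the column filters you already have to this scan object
scan.setFilter(colFilter);

// Then get a scanner and iterate through the results
ResultScanner scanner = table.getScanner(scan);
for (Result result = scanner.next(); result != null; result = scanner.next())
{
    // Use the result object
}
scanner.close(); // release the scanner's resources when done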
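
The stop row "a.b.x|2" above is simply the prefix "a.b.x|1" with its last byte incremented. As a rough sketch of how such a stop row could be derived for any prefix, here is a plain-Java helper with no HBase dependency. The class name PrefixStopRow and the method stopRowForPrefix are hypothetical names chosen for illustration, not part of the HBase API. Newer HBase clients also offer Scan.setRowPrefixFilter, which I believe sets the start and stop rows in a similar way, but check the version you are running.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRow {
    // Hypothetical helper: returns the smallest key that sorts after every key
    // starting with the given prefix, or an empty array if no such key exists
    // (empty prefix, or a prefix made only of 0xFF bytes).
    public static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            // An empty stop row means "scan to the end of the table" in HBase.
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("a.b.x|1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints a.b.x|2
    }
}
```

The resulting array could then be passed as the stop row of the Scan, with the prefix itself as the start row.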

Update: ToBytes should be toBytes.
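
To illustrate the warning about row keys that are not fixed length, here is a small plain-Java sketch (again without HBase) that sorts a few keys by unsigned byte-wise comparison, which is how the lexicographic ordering described above behaves. The key "a.b.x|10|0000001" is a made-up example that does not appear in the question, and the class name RowKeyOrder is hypothetical. Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowKeyOrder {
    // Sorts keys by comparing their UTF-8 bytes as unsigned values,
    // i.e. byte-wise lexicographic order.
    public static List<String> sortAsBytes(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((a, b) -> Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
                "a.b.x|2|3000320",
                "a.b.x|10|0000001", // made-up key with a two-digit middle field
                "a.b.x|1|2999238");
        // prints [a.b.x|10|0000001, a.b.x|1|2999238, a.b.x|2|3000320]
        System.out.println(sortAsBytes(keys));
    }
}
```

Because '0' sorts before '|', the "a.b.x|10|..." key lands ahead of "a.b.x|1|...", so a scan from "a.b.x|1" to "a.b.x|2" would also return it. If the middle field can grow beyond one digit, including the trailing delimiter in the start row ("a.b.x|1|") or zero-padding the field is one way to avoid that.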
