
HBase: scan multiple versions in insertion order

I want to scan an HBase table for 10 versions. The result comes back ordered from latest to oldest, and I want it in the reverse order. Is there a way to do that?

Example: if I put data into the 'test' table in the following order:

put 'test','1','data:a','v0'
put 'test','1','data:a','v1'
put 'test','1','data:a','v2'

Scanning 3 versions returns them in the following order:

scan 'test',{VERSIONS=>3}
ROW COLUMN+CELL
1  column=data:a, timestamp=1537869886607, value=v2
1  column=data:a, timestamp=1537869884212, value=v1
1  column=data:a, timestamp=1537869881996, value=v0

I want to get the result in the reverse order.

My full use case is scan-and-put: if I get the result in latest-to-oldest order, I end up writing the versions in reverse order when I put them.
Here is the code:

Scan scan = new Scan();
scan.setCacheBlocks(false);
scan.setCaching(10000);
scan.setMaxVersions(10);
ResultScanner scanner = tableGet.getScanner(scan);
for (Result result = scanner.next(); result != null; result = scanner.next()) {
  // Decode the row key as UTF-8 rather than with the platform default charset.
  String row = Bytes.toString(result.getRow());
  Put put = new Put(Bytes.toBytes(row));
  String key = "KEY" + ";" + row;
  for (Cell cell : result.rawCells()) {
    String family = Bytes.toString(CellUtil.cloneFamily(cell));
    String column = Bytes.toString(CellUtil.cloneQualifier(cell));
    byte[] value = CellUtil.cloneValue(cell);
    // No timestamp is passed, so the server assigns its current time to every cell.
    put.addColumn(Bytes.toBytes(family), Bytes.toBytes(column), value);
  }
  tablePut.put(put);
}

You can put the records with reverse-order timestamps using either of the following two approaches:

  1. Put the row into HBase with explicit timestamp values.

Doing a put always creates a new version of a cell at a certain timestamp. By default the system uses currentTimeMillis, but you can specify the timestamp (a long integer) yourself, at the per-column level. This means you could assign a time in the past or the future, or use the long value for non-time purposes.
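To illustrate why the timestamp, rather than the order of the put calls, decides the order in which versions come back, here is a minimal sketch in plain Java. `VersionedCellModel` is a hypothetical in-memory stand-in for a single HBase column, not part of the HBase client API, and the sketch assumes the copied versions are allowed to keep their source timestamps. With the real client, the equivalent would presumably be passing `cell.getTimestamp()` to the four-argument `put.addColumn(family, qualifier, timestamp, value)`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory stand-in for one HBase column (hypothetical; not the HBase client API).
class VersionedCellModel {
    // Versions keyed by timestamp, highest first, mirroring the newest-first scan order.
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());

    // Store a value under an explicit timestamp; writing the same timestamp again replaces it.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Values in read order: highest timestamp first.
    List<String> readValues() {
        return new ArrayList<>(versions.values());
    }

    // Copy every version into another cell, keeping each source timestamp.
    void copyTo(VersionedCellModel target) {
        for (Map.Entry<Long, String> entry : versions.entrySet()) {
            target.put(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        VersionedCellModel source = new VersionedCellModel();
        source.put(1537869881996L, "v0");
        source.put(1537869884212L, "v1");
        source.put(1537869886607L, "v2");

        VersionedCellModel target = new VersionedCellModel();
        source.copyTo(target); // iterates newest to oldest, like the scan in the question
        System.out.println(target.readValues()); // prints [v2, v1, v0]
    }
}
```

Because each version carries its own timestamp, the copy reads back in the same order as the source even though the values were written newest first.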

Initialize the timestamp value as:

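// Inverted timestamp: later writes get smaller values, so they sort as "older".
long timestamp = Long.MAX_VALUE - System.currentTimeMillis();
Put put = new Put(Bytes.toBytes(rowKey), timestamp);
// addColumn replaces the deprecated Put.add(family, qualifier, value).
put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value.toString()));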
table.put(put);
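The effect of the inverted timestamp can be checked without a cluster. The following is a small plain-Java sketch, not HBase code: `InvertedTimestampDemo`, `invert`, and `readOrder` are hypothetical names, and the `TreeMap` only imitates the newest-first ordering of versions. It reuses the write times from the question's scan output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Plain-Java illustration of inverted timestamps (hypothetical helper, not HBase API).
class InvertedTimestampDemo {
    // Map a wall-clock time to a timestamp that sorts in the opposite direction.
    static long invert(long millis) {
        return Long.MAX_VALUE - millis;
    }

    // Values as a newest-first read would return them when each one
    // is stored under invert(writeTime) instead of writeTime.
    static List<String> readOrder(long[] writeTimes, String[] values) {
        TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());
        for (int i = 0; i < writeTimes.length; i++) {
            versions.put(invert(writeTimes[i]), values[i]);
        }
        return new ArrayList<>(versions.values());
    }

    public static void main(String[] args) {
        long[] writeTimes = {1537869881996L, 1537869884212L, 1537869886607L};
        String[] values = {"v0", "v1", "v2"};
        System.out.println(readOrder(writeTimes, values)); // prints [v0, v1, v2]
    }
}
```

Two caveats follow from the sketch: the stored timestamps no longer represent wall-clock time, and two writes to the same column within the same millisecond end up with the same timestamp, so one replaces the other.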

References:
https://hbase.apache.org/1.1/apidocs/org/apache/hadoop/hbase/client/Put.html#Put(byte[],%20long)
https://www.ngdata.com/bending-time-in-hbase/

  2. Use a HashMap whose key is a "family_column" string (the column family and column name concatenated with '|' or '_') and whose value is a LinkedList of values.

    HashMap<String, LinkedList<byte[]>> values = new HashMap<>();

Inside the for loop, add each value to the LinkedList stored under its key. After the for loop, iterate over the HashMap; for every entry, get the LinkedList value and reverse it using:

Collections.reverse(list);

Now iterate through the reversed list and put the elements into HBase.
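The grouping and reversing steps of this second approach can be sketched in plain Java as follows. `ReverseVersionsDemo` and `oldestFirst` are hypothetical names, each cell is represented as a `{family, qualifier, value}` string triple instead of an HBase `Cell`, and values are kept as `String` rather than `byte[]` for readability. The writes back to HBase are left out; the sketch assumes each value would then be written with its own put, in list order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of the group-then-reverse step (hypothetical helper, not HBase API).
class ReverseVersionsDemo {
    // Group {family, qualifier, value} triples by "family|qualifier",
    // then reverse each group so the oldest value comes first.
    static Map<String, LinkedList<String>> oldestFirst(List<String[]> cellsNewestFirst) {
        Map<String, LinkedList<String>> values = new HashMap<>();
        for (String[] cell : cellsNewestFirst) {
            String key = cell[0] + "|" + cell[1];
            values.computeIfAbsent(key, k -> new LinkedList<>()).add(cell[2]);
        }
        for (LinkedList<String> list : values.values()) {
            Collections.reverse(list);
        }
        return values;
    }

    public static void main(String[] args) {
        // Cells in the order a multi-version scan returns them: newest first.
        List<String[]> cells = new ArrayList<>();
        cells.add(new String[] {"data", "a", "v2"});
        cells.add(new String[] {"data", "a", "v1"});
        cells.add(new String[] {"data", "a", "v0"});

        System.out.println(oldestFirst(cells).get("data|a")); // prints [v0, v1, v2]
    }
}
```

Writing the values in this order only reproduces the original version order if consecutive puts receive distinct, increasing timestamps, which is worth verifying when the puts are issued in quick succession.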

