How to use an HBase secondary index table as an input in a MapReduce job?
I am new to HBase. I have a main table with rowkey = id-YYYYMMDD, and a secondary index table with rowkey = YYYYMMDD-id and a column holding the corresponding rowkey in the main table. I will have about 1 million ids in the near future, and I will need to create a MapReduce job to summarize the ids for a given date (YYYYMMDD).
How do I pass the secondary index table to the MapReduce job so that the corresponding "get(rowkey)" calls are run against the main table to fetch the columns and summarize the data?
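To make the key layout concrete, here is a minimal plain-Java sketch (no HBase dependency) of the two row-key formats and of the start/stop rows that would bound an index-table scan for a single date. The class and method names (`IndexKeys`, `indexKey`, `mainKeyFromIndexKey`, `startRow`, `stopRow`) are hypothetical, and the sketch assumes the date part is always 8 characters followed by a `-` and that keys are encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

class IndexKeys {
    // Index-table row key, as described in the question: YYYYMMDD-id
    static String indexKey(String date, String id) {
        return date + "-" + id;
    }

    // Main-table row key, as described in the question: id-YYYYMMDD
    static String mainKey(String id, String date) {
        return id + "-" + date;
    }

    // Derive the main-table key from an index key.
    // Assumption: the date is the first 8 characters, followed by '-'.
    static String mainKeyFromIndexKey(String indexKey) {
        String date = indexKey.substring(0, 8);
        String id = indexKey.substring(9);
        return mainKey(id, date);
    }

    // Inclusive lower bound for all index rows of one date: "YYYYMMDD-"
    static byte[] startRow(String date) {
        return (date + "-").getBytes(StandardCharsets.UTF_8);
    }

    // Exclusive upper bound: the same prefix with its last byte incremented,
    // so '-' (0x2D) becomes '.' (0x2E).
    static byte[] stopRow(String date) {
        byte[] stop = startRow(date);
        stop[stop.length - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        String idx = indexKey("20240115", "42");
        System.out.println(idx);                      // 20240115-42
        System.out.println(mainKeyFromIndexKey(idx)); // 42-20240115
        System.out.println(new String(stopRow("20240115"), StandardCharsets.UTF_8)); // 20240115.
    }
}
```

Byte arrays like the ones returned by `startRow` and `stopRow` are the kind of bounds one would give to a scan over the index table, so that only rows for the requested date reach the mappers. Each mapper could then use the derived (or stored) main-table key for its lookup.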
You have 2 options: