
Writing high volume reducer output to HBase

Original question posted 2014-02-13 00:24:56. Tags: hadoop, hbase

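I have a Hadoop MapReduce job whose output is a row ID together with a Put/Delete operation for that row. Because of the nature of the problem, the output volume is quite high. We have tried several ways of getting this data back into HBase, and all of them have failed...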

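TableReducer

This is far too slow, since it appears to make a full round trip for every row. Because of how the keys are sorted for our reducer step, the row ID is unlikely to live on the same node as the reducer.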

completebulkload

This seems to take a very long time (it never completes), with no real indication of why. Both IO and CPU show very low usage.

Am I missing something obvious?

2 Answers

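I see from your answer that you solved your problem, but for completeness I'll mention another option: writing directly to HBase. We have a setup where we stream data into HBase, and with proper keys and region splits we get more than 15,000 1 KB records per second per node.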
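To illustrate what "proper keys and region splits" might look like, here is a minimal sketch in plain Java with no HBase dependency. It assumes one common approach, prefixing each row ID with a hash-derived salt bucket and pre-splitting the table on those prefixes; the answer does not say which key scheme was actually used. The class name `SaltedKeys`, the bucket count, and the `|` separator are all hypothetical choices for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: salts row IDs so writes spread across pre-split regions.
public class SaltedKeys {

    // Deterministic bucket for a row ID (a simple polynomial hash, stable across JVMs).
    public static int bucketFor(String rowId, int buckets) {
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("buckets must be in 1..256");
        }
        int h = 0;
        for (byte b : rowId.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + (b & 0xff);
        }
        return Math.floorMod(h, buckets);
    }

    // Row key actually written to the table: two hex digits of salt, then the row ID.
    public static String saltedKey(String rowId, int buckets) {
        return String.format("%02x|%s", bucketFor(rowId, buckets), rowId);
    }

    // Split points "01", "02", ... so that each salt prefix lands in its own region.
    // These would be passed (as bytes) when creating the table pre-split.
    public static List<String> splitPoints(int buckets) {
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("buckets must be in 1..256");
        }
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < buckets; i++) {
            splits.add(String.format("%02x", i));
        }
        return splits;
    }

    public static void main(String[] args) {
        int buckets = 16; // assumed value; tune to the number of region servers
        System.out.println("split points: " + splitPoints(buckets));
        for (String id : new String[] {"row-1", "row-2", "row-3"}) {
            System.out.println(id + " -> " + saltedKey(id, buckets));
        }
    }
}
```

The trade-off with this kind of scheme is that readers must recompute the salt to look up a single row, and a range scan over the original row IDs has to fan out over every bucket.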

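CompleteBulkLoad was the right answer. Following @DonaldMiner's suggestion, I dug deeper and found that the CompleteBulkLoad process runs as "hbase", which caused permission-denied errors when it tried to move/rename/delete the source files. The implementation appears to retry for a long time before reporting an error; up to 30 minutes in our case.

Giving the hbase user write access to the files resolved the issue.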
