
HBase installation with Hadoop YARN

Original post: 2014-09-23 12:10:07. Tags: java, hadoop, hbase

I am trying to install HBase v0.98.6 on Ubuntu. I already have Hadoop YARN running on the machine. Should I stop the existing Hadoop processes and rely solely on HBase, or configure HBase to work with the existing YARN setup? I can share more information if needed. I plan to run HBase first in pseudo-distributed mode and then in distributed mode.

To be clear: I am asking whether Hadoop YARN needs to be running before I install HBase (in distributed mode, not on a single machine). If it does not, and Hadoop YARN is still present on those machines, will it cause any problems for HBase running on the same servers?

1 Answer

The question is a bit confusing, but the key point is that HBase and YARN do not depend on each other.

You can safely stop the YARN services and still use HBase. The only services HBase uses from your existing cluster are HDFS and ZooKeeper.
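As a rough sketch of what that looks like in practice, an hbase-site.xml for pseudo-distributed mode only needs to point at HDFS and ZooKeeper, and nothing in it refers to YARN. The host name, port and path below are assumptions for a single-machine setup, not values taken from the question; the HDFS URI should match whatever NameNode address your Hadoop configuration already uses.

```xml
<?xml version="1.0"?>
<!-- hbase-site.xml: minimal sketch; all values are placeholders (assumptions) -->
<configuration>
  <!-- Where HBase stores its data in HDFS; host and port must match your NameNode -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- true for pseudo-distributed and fully distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Comma-separated list of ZooKeeper hosts; localhost assumed here -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
</configuration>
```

When you move to fully distributed mode, the same three properties should still be the core of the file, with the real NameNode address and the real ZooKeeper hosts substituted.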

Some people use MapReduce jobs to aggregate data for HBase tables, and in that case YARN and HBase can coexist.

If your use case is real-time data access/update with high throughput, then I would recommend not running YARN alongside HBase.

Hope this is what you were looking for :)