
How to connect to the HDFS cluster after installing through Cloudera Manager 5

I just installed an HDFS cluster using Cloudera Manager 5 (CM5) with its default settings on three nodes (n1.example.com, n2.example.com and n3.example.com). The nodes are virtualized in Parallels (Mac OS X 10.10.1 Yosemite). I am able to browse the HDFS filesystem with "sudo -u hdfs hadoop fs -ls /" from within any of the nodes.

Now I am trying to access HDFS from my ETL tool, which runs on the host OS (Mac OS X), using the default ID/password/port: hdfs/(blank)/8020. But I get "Connection Refused". I've attached a screenshot of the ETL tool.

So I installed the ETL tool (Pentaho Kettle) on the n2 node and tried connecting using localhost from the server, but that fails with the same "Connection Refused" error. However, a command such as "sudo -u hdfs hadoop fs -ls /" works fine there.

Am I missing anything?

FYI, I've already disabled the firewall on all three nodes, since they are only running in my virtual machine environment as a test.
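For completeness, a basic connectivity check like the following (assuming n1.example.com is the NameNode host; adjust to your cluster) also needs to succeed before any external tool can connect:

# From the Mac host: verify the NameNode RPC port (8020 by default on CDH) is reachable
nc -vz n1.example.com 8020

# On the NameNode machine: confirm the process is listening on a non-loopback address
sudo netstat -tlnp | grep 8020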

Thank you!

[Screenshot: HDFS connection dialog]

It turned out to be an ETL configuration issue. The open-source Pentaho Data Integration tool, Kettle, ships with Apache Hadoop 2.0 as its default Hadoop configuration, and that needed to be changed to match my CDH distribution.

That is, I needed to modify the file data-integration/plugins/pentaho-big-data-plugin/plugin.properties so that the following line replaces the existing one:

active.hadoop.configuration=cdh51
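The value has to match the name of one of the Hadoop configurations (shims) shipped under the plugin's hadoop-configurations folder; the exact set varies by Kettle version, and it can be checked with something like:

# List the bundled Hadoop configurations; cdh51 must appear here for the setting above to work
ls data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/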

Detailed information is available on the Pentaho wiki: http://wiki.pentaho.com/display/BAD/Configuring+Pentaho+for+your+Hadoop+Distro+and+Version
