
Access HDFS in Remote Cluster

I have a remote Hadoop cluster. When I try to access data on a datanode through the namenode, the namenode redirects me to the datanode. However, the datanode domain name it returns can only be resolved inside that cluster, and I cannot edit /etc/hosts on the client side.

Can I configure the namenode to redirect me to a specific IP or domain? Where does the namenode record the domain names it returns?
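To make the redirect behavior concrete, here is a minimal sketch of what a WebHDFS read request looks like. The hostnames and the default port 9870 are placeholders of my own, not values from the question:

```python
from urllib.parse import urlencode

def webhdfs_open_url(namenode_host: str, path: str,
                     port: int = 9870, user: str = "hdfs") -> str:
    """Build the WebHDFS OPEN URL that a client sends to the namenode.

    The namenode does not stream the file itself: it replies with an
    HTTP 307 redirect whose Location header points at a datanode, using
    the hostname that datanode registered with at heartbeat time. That
    internal hostname is exactly what a client outside the cluster
    cannot resolve.
    """
    query = urlencode({"op": "OPEN", "user.name": user})
    return f"http://{namenode_host}:{port}/webhdfs/v1/{path.lstrip('/')}?{query}"

print(webhdfs_open_url("namenode.internal", "/user/alice/data.txt"))
# -> http://namenode.internal:9870/webhdfs/v1/user/alice/data.txt?op=OPEN&user.name=hdfs
```

In other words, the "domain to return" is not a table the namenode keeps for you to edit; it is whatever address each datanode registered with, which is why the fix usually happens on the network path rather than in the namenode.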

I believe what you need is a gateway server (also called an edge node). There are several tutorials out there.

In your particular case, the server holding the namenode can also act as the edge node.

Two approaches in particular can achieve this:

  1. Using a SOCKS proxy (see the question "Using Hadoop through a SOCKS proxy?").
  2. Using HttpFS: https://hadoop.apache.org/docs/r2.4.1/hadoop-hdfs-httpfs/index.html
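For option 1, the client-side setup is typically a tunnel plus two Hadoop properties. This is only a sketch: `localhost:1080` and the edge-node address are placeholders, and you open the tunnel yourself, e.g. with `ssh -D 1080 user@edge-node`:

```xml
<!-- core-site.xml on the client (sketch for the SOCKS option).
     localhost:1080 must match the -D port of your SSH tunnel. -->
<property>
  <name>hadoop.rpc.socket.factory.class.default</name>
  <value>org.apache.hadoop.net.SocksSocketFactory</value>
</property>
<property>
  <name>hadoop.socks.server</name>
  <value>localhost:1080</value>
</property>
```

With this in place, Hadoop RPC traffic, including the connections to the internal datanode hostnames, is tunneled through the edge node, which can resolve them. Option 2 (HttpFS) avoids the problem differently: the HttpFS gateway streams the data itself over HTTP, so the client never contacts a datanode at all.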
