
How does Hadoop work? How does a client connect to a Hadoop cluster?

I have a basic understanding of Hadoop. My question is about how a client/developer connects to a Hadoop cluster to run queries.

For example, say I am a Hadoop developer and the Hadoop cluster is in some remote location. How do I connect to the cluster to run my Java code? Do I have to install Hadoop on my laptop as well (for which I would have to run Linux)?

Or is it enough to be on the same network as the Hadoop cluster, mount a share on my laptop, and put my code onto the cluster that way?

Second question: to run my Java code, do I have to SSH into a data node and launch the job there?

These two questions have been nagging me; I don't have any real-world experience.

Thank you in advance!

To open a file, a client contacts the NameNode and retrieves a list of locations for the blocks that comprise the file. These locations identify the DataNodes which hold each block. Clients then read file data directly from the DataNode servers, possibly in parallel. The NameNode is not directly involved in this bulk data transfer, keeping its overhead to a minimum.
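As a minimal sketch of that read path in Java (assuming Hadoop 1.x-style configuration; the NameNode address hdfs://namenode:9000 and the file path /user/me/data.txt are placeholders), the standard FileSystem API hides the NameNode lookup and the direct DataNode reads behind a single open() call:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder address: points the client at the remote NameNode
            // ("fs.defaultFS" on newer Hadoop releases).
            conf.set("fs.default.name", "hdfs://namenode:9000");

            FileSystem fs = FileSystem.get(conf);
            // open() asks the NameNode only for the block locations; the
            // returned stream then reads the bytes straight from the DataNodes.
            try (FSDataInputStream in = fs.open(new Path("/user/me/data.txt"));
                 BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }

Note that only the small block-location metadata passes through the NameNode; the bulk data transfer bypasses it entirely, which is why it stays lightweight.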

I don't think you have the full picture of a Hadoop cluster yet. The following link explains how a Hadoop cluster and its network are put together:

http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

As far as I know, you do not need Hadoop installed on your laptop to run a job on a remote Hadoop cluster. You just need remote access to the JobTracker so you can submit the job.
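As a rough sketch of such a remote submission (assuming a Hadoop 1.x cluster; the NameNode and JobTracker addresses and the HDFS paths below are placeholders), a driver class on your laptop needs only the Hadoop client libraries on its classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RemoteJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder addresses for the remote cluster's NameNode and JobTracker.
            conf.set("fs.default.name", "hdfs://namenode:9000");
            conf.set("mapred.job.tracker", "jobtracker:9001");

            // No mapper/reducer set: Hadoop falls back to the identity classes,
            // so this job simply copies its input records to the output path.
            Job job = new Job(conf, "pass-through job");
            job.setJarByClass(RemoteJobDriver.class);
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path("/user/me/input"));
            FileOutputFormat.setOutputPath(job, new Path("/user/me/output"));

            // The client ships the job JAR and configuration to the JobTracker;
            // nothing runs locally beyond this submission step.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }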

For the second point, "is it OK if I am in the same network as the Hadoop cluster and simply mount the share on my laptop and put my code into the Hadoop cluster?":

  • Putting your code into the Hadoop cluster must go through the right channel, i.e. through the master node. In Hadoop you submit your data and code to the master node, and it is the master node's job to distribute them across the cluster.

  • For running Java code, do I have to SSH to a data node and then run the job? ==> You would SSH to the JobTracker (the master), not to a DataNode. DataNodes are the slaves that store data; the JobTracker is the master that assigns work across the cluster. A submission command is sketched below.
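For example, once your job JAR has been copied to the master node (or to any machine with Hadoop client access), the standard hadoop jar command submits it from an SSH session there; the JAR name, driver class, and HDFS paths here are placeholders:

    hadoop jar myjob.jar com.example.MyDriver /user/me/input /user/me/output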
