简体繁体中英

Hadoop Processing time in clustered and standalone system

原文 2013-02-21 19:04:33 2 1 ubuntu/ hadoop/ hbase/ distributed-computing

I have set up a 3 node hadoop cluster (1 Namenode, 2 data nodes) and hbase on top of the same hdfs. Each node are 512 MB Ubuntu Virtual box images running on my windows 8 Machine(Intel i5,4GB RAM, 2.4Ghz)
I have configured hbase-hadoop based on this blog http://ankitasblogger.blogspot.in/2011/01/hadoop-cluster-setup.html

I have written a program, which analyzes US Census Data which is approximately has 500,000 records(reduced set). I am just reading the file(from hdfs) in MAP task and storing it is HBASE . and later retrieving data based on a filter.

When I run the program in a stand alone(512 MB Virtual Machine) hadoop-hbase, it takes around 23 minutes. But when I run the same jar in the cluster(512*3 MB) it takes upwards of 40 minutes.

Why is the cluster taking more time to process? or is it a expected result ?

1 answers

running a cluster in virtual-machines will only slow down your map-reduce (because of the overhead from running the virtual-os and multiple hadoop instances) especially if you run out of memory and it has to use the swap from the host os.

keep in mind that the virtual-machines all share 1 physical CPU and should only be used for development.

how to configure ssh for hadoop standalone installation?

How can I start hadoop at boot time?

Memcached in standalone Docker container time out and port error

Change “system” font size in processing IDE (PDE) on Ubuntu

C - System(“”); Execute one at a time

how to import joda time in java on Ubuntu system

How to Run GLSL Program Based on System Time

Get system time not accurate by using chrono library

Using the system's time zone settings in PHP

Changing ubuntu system time using java code

暂无

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

Related Question how to configure ssh for hadoop standalone installation? How can I start hadoop at boot time? Memcached in standalone Docker container time out and port error Change “system” font size in processing IDE (PDE) on Ubuntu C - System(“”); Execute one at a time how to import joda time in java on Ubuntu system How to Run GLSL Program Based on System Time Get system time not accurate by using chrono library Using the system's time zone settings in PHP Changing ubuntu system time using java code

Related Tags

Hadoop Processing time in clustered and standalone system

Question

1 answers

solution1 1 2013-02-21 19:19:58

solution1
1 2013-02-21 19:19:58