
Are there any use cases where Hadoop MapReduce can do better than Apache Spark?

Original post 2015-08-03 12:00:20 6 1 apache-spark/ hadoop/ mapreduce

I agree that Spark handles iterative and interactive workloads much better than MapReduce. I also agree that HDFS, or any Hadoop data store such as HBase, can serve as the storage layer for Spark.

So my question is: are there any real-world use cases where Hadoop MR is better than Apache Spark? Here "better" means performance, throughput, and latency. Is Hadoop MR still a better choice than Spark for batch processing?

If so, can anyone explain the advantages of Hadoop MR over Apache Spark? Please keep the discussion limited to the computation layer.

1 answer

As you said, Spark is better than Hadoop for iterative and interactive workloads. But Spark needs a lot of memory: if there is not enough, it can easily throw an OOM (out of memory) exception. Hadoop deals with that situation very well, because it has a good fault-tolerance mechanism.

Secondly, if data skew occurs, Spark may also collapse. I am comparing Spark and Hadoop on system robustness, because that decides whether a job succeeds.
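To make the data-skew point a bit more concrete, here is a minimal pure-Python sketch (not Spark code) of what happens when records are hash-partitioned by key and one key dominates. The key names, the 90% share, the 8 partitions, and the use of `crc32` as the hash are all assumptions for illustration; Spark's own partitioner works differently in detail, but the effect on partition sizes is similar in spirit.

```python
from collections import Counter
import zlib


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition under hash partitioning."""
    sizes = Counter()
    for key in keys:
        # crc32 is a stable hash, so the result is reproducible between runs.
        sizes[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return [sizes[p] for p in range(num_partitions)]


# Hypothetical data set: one "hot" key accounts for 90% of the records.
skewed_keys = ["hot"] * 9000 + [f"user{i}" for i in range(1000)]

sizes = partition_sizes(skewed_keys, num_partitions=8)
print(sizes)
print("largest partition share:", max(sizes) / len(skewed_keys))
```

All records with the same key go to the same partition, so one partition ends up holding at least 90% of the data. A task that tries to keep that partition in memory is the one most likely to run out of it.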

I recently tested Spark and Hadoop performance with some benchmarks. According to the results, Spark does not perform better than Hadoop on some workloads, e.g. k-means and PageRank. Memory may be the limiting factor for Spark.

Apache spark map-reduce explanation

Use Spark internally Map-Reduce?

Spark and Map-Reduce together

Why is Spark faster than Hadoop Map Reduce

Map-reduce on more than one keys

Why spark is 100 times faster than Hadoop Map Reduce

How to handle Incremental Update in HDFS hadoop Map-Reduce

Is caching the only advantage of spark over map-reduce?

How to use spark for map-reduce flow to select N columns, top M rows of all csv files under a folder?

How do you use map-reduce in scala dataframes when u have 2 fields and second field should be split?





solution1 0 2015-08-04 08:08:37
