
Execute MapReduce in Spark

My teacher said that we can execute MapReduce jobs in Spark. But since Spark is said to be faster than Hadoop, does that mean it is always better to run MapReduce in Spark? Does Hadoop MapReduce then become useless? Is this correct?

You can execute map() and reduce() function operations on Spark RDDs and DataFrames.
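For illustration, here is the classic word count expressed as map and reduce steps. The sketch below uses plain Python so it runs without a cluster; the commented lines show the PySpark RDD equivalent, where the same lambdas run distributed (assuming an existing SparkContext `sc`).

```python
from functools import reduce

# Word count as map/reduce steps, in plain Python so the sketch
# runs without a Spark installation.
# The PySpark RDD equivalent (assuming a SparkContext `sc`) would be:
#   sc.parallelize(lines).flatMap(lambda l: l.split()) \
#     .map(lambda w: (w, 1)) \
#     .reduceByKey(lambda a, b: a + b)

lines = ["spark runs map reduce", "hadoop runs map reduce too"]

# map phase: split lines into words and emit (word, 1) pairs
pairs = [(w, 1) for line in lines for w in line.split()]

# reduce phase: sum the counts per key
counts = reduce(
    lambda acc, kv: {**acc, kv[0]: acc.get(kv[0], 0) + kv[1]},
    pairs,
    {},
)

print(counts["map"])    # 2
print(counts["spark"])  # 1
```

The point is that map() and reduce() are just higher-order functions over a collection; Spark applies them to a partitioned, distributed collection (an RDD) instead of an in-memory list.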

I think this is what your teacher meant.

Also, Spark is not faster than Hadoop - it's complementary to Hadoop. It might be faster than MapReduce, but given proper resource allocation, Tez execution can actually be faster than both Spark and MapReduce, while requiring fewer total resources than Spark. Unfortunately, in most cases one doesn't write Tez code directly; it usually runs as the execution engine behind actions in Pig or Hive.
