
How to start the map phase while the reduce phase is working

I have this scenario: two jobs, JobA and JobB. Is there a way to start JobB's map phase on the data produced by JobA's reduce phase while that reduce phase is still running?

Thanks!

The only thing that comes to mind is to have a thread (started in your driver class) that continuously checks JobA's output directory. Once a particular part-r-xxxx file (or set of files) has been created and completely written, you can start JobB with that file (or set of files) as its input.

The only problem I can identify so far is how to check whether a part-r-xxxx file has been completely written.
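
The polling idea can be sketched roughly as follows. This is only an illustration under several assumptions: a local directory stands in for JobA's output directory, a file counts as "completely written" when its size is unchanged between two consecutive polls (a heuristic, not a guarantee), and `stable_part_files`, `watch` and the `on_ready` callback are hypothetical names rather than part of any Hadoop API. In a real driver, `on_ready` is where JobB would be submitted.

```python
import os
import re
import threading

# Matches reducer output files such as part-r-00000 (skips _SUCCESS, temp dirs, etc.).
PART_FILE = re.compile(r"^part-r-\d+$")


def stable_part_files(output_dir, previous_sizes):
    """Return part files whose size has not changed since the previous poll.

    ASSUMPTION: "size unchanged between two polls" is used as a stand-in for
    "completely written". previous_sizes is updated in place.
    """
    ready = []
    for name in sorted(os.listdir(output_dir)):
        if not PART_FILE.match(name):
            continue
        try:
            size = os.path.getsize(os.path.join(output_dir, name))
        except OSError:
            continue  # file vanished between listing and stat; retry next poll
        if previous_sizes.get(name) == size:
            ready.append(name)
        previous_sizes[name] = size
    return ready


def watch(output_dir, on_ready, stop_event, interval=1.0):
    """Poll output_dir until stop_event is set; call on_ready once per stable file."""
    seen = set()
    sizes = {}
    while not stop_event.is_set():
        for name in stable_part_files(output_dir, sizes):
            if name not in seen:
                seen.add(name)
                on_ready(os.path.join(output_dir, name))  # hypothetical hook: submit JobB here
        stop_event.wait(interval)
```

The watcher would typically run in a background thread, for example `threading.Thread(target=watch, args=(out_dir, submit_job_b, stop), daemon=True).start()`, where `submit_job_b` is whatever function launches JobB on the given path and `stop` is a `threading.Event` set once JobA finishes.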
