
Multiple input files for each mapper 'type'

I am trying to run a job where each mapper 'type' receives a different input file. I know there is a way to do this in Java using the MultipleInputs class, like so:

MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, CounterMapper.class);
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, CountertwoMapper.class);

Where CounterMapper.class and CountertwoMapper.class are the respective mapper 'types'.

I am trying to achieve similar functionality with mrjob for Python, or with any other language that is not Java (please don't ask why!).

This image is similar to what I want to achieve.

Any help is appreciated.

I have found a way in which different mappers can be associated with a single input path. This doesn't exactly answer your question, but I hope it helps. See the link below:

Using multiple mapper inputs in one streaming job on hadoop?
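
For what it's worth, a common workaround in Hadoop Streaming is to run one mapper script that checks which input file it is reading and dispatches to the matching mapper function. The sketch below assumes Hadoop exposes the current file path in the environment variable mapreduce_map_input_file (map_input_file on older releases). The two mapper functions and the rule that files for the second mapper contain "two" in their name are hypothetical, made up only for illustration.

```python
import os
import sys


def counter_mapper(line):
    # Hypothetical stand-in for CounterMapper: emit one count per word.
    for word in line.split():
        yield word, 1


def counter_two_mapper(line):
    # Hypothetical stand-in for CountertwoMapper: emit the line length.
    yield "line_length", len(line)


def pick_mapper(input_path):
    # Assumed naming convention: inputs for the second mapper have "two"
    # in their file name; everything else goes to the first mapper.
    name = os.path.basename(input_path)
    return counter_two_mapper if "two" in name else counter_mapper


def current_input_path(environ=os.environ):
    # Hadoop Streaming passes jobconf values as environment variables with
    # dots replaced by underscores; the older name is kept as a fallback.
    return environ.get("mapreduce_map_input_file") or environ.get("map_input_file", "")


def run(lines, environ=os.environ):
    # Choose the mapper once per task, then emit tab-separated key/value pairs.
    mapper = pick_mapper(current_input_path(environ))
    for line in lines:
        for key, value in mapper(line.rstrip("\n")):
            yield f"{key}\t{value}"


def main():
    # Entry point when the file is used as the streaming -mapper script.
    for record in run(sys.stdin):
        print(record)
```

To use it as the mapper of a streaming job, call main() at the bottom of the script and pass both input paths with separate -input options. With mrjob, the same dispatch should be possible inside a single mapper method by reading the file name with mrjob.compat.jobconf_from_env, though I have not tested that here.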
