How to read multiple files from multiple directories in Map-Reduce
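I want to read multiple files from multiple directories in a Map-Reduce program. I tried specifying the paths in the main method:
FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/user/test/"));
FileInputFormat.setInputPaths(conf,new Path("hdfs://localhost:54310/Test/test1/"));
But it reads from only one of them.
What should I do to read multiple files?
Please suggest a solution.
Thanks.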
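FileInputFormat#setInputPaths sets the input paths and overwrites any paths that were set before, so in the code above only the second directory remains as input. Use FileInputFormat#addInputPath or FileInputFormat#addInputPaths to add to the existing paths instead. Alternatively, pass both paths to a single setInputPaths call, which accepts several Path arguments.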
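To make the overwrite-versus-append difference concrete, here is a minimal plain-Java sketch. It is a toy model, not the Hadoop API: the class InputPathsSketch, the key name "input.dir" and the use of a Map in place of the job configuration are all assumptions made for illustration. It only mimics the idea that the input paths live in one comma-separated configuration entry, and it assumes paths contain no commas.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names): mimics how a list of input paths
// behaves when it is stored in a single configuration entry.
final class InputPathsSketch {
    // Hypothetical key chosen for this sketch, not a real Hadoop property name.
    static final String INPUT_DIR = "input.dir";

    // Behaves like a "set" call: replaces whatever was stored before.
    static void setInputPaths(Map<String, String> conf, String... paths) {
        conf.put(INPUT_DIR, String.join(",", paths));
    }

    // Behaves like an "add" call: appends to the existing list.
    static void addInputPath(Map<String, String> conf, String path) {
        String existing = conf.get(INPUT_DIR);
        boolean empty = existing == null || existing.isEmpty();
        conf.put(INPUT_DIR, empty ? path : existing + "," + path);
    }

    // Returns the stored paths as an array (empty if nothing was set).
    static String[] getInputPaths(Map<String, String> conf) {
        String dirs = conf.getOrDefault(INPUT_DIR, "");
        return dirs.isEmpty() ? new String[0] : dirs.split(",");
    }

    public static void main(String[] args) {
        // Two "set" calls, as in the question: the second replaces the first.
        Map<String, String> overwritten = new HashMap<>();
        setInputPaths(overwritten, "/user/test/");
        setInputPaths(overwritten, "/Test/test1/");
        System.out.println(getInputPaths(overwritten).length); // prints 1

        // One "set" followed by one "add": both directories are kept.
        Map<String, String> appended = new HashMap<>();
        setInputPaths(appended, "/user/test/");
        addInputPath(appended, "/Test/test1/");
        System.out.println(getInputPaths(appended).length); // prints 2
    }
}
```

On a real job, the corresponding change would be to replace the second FileInputFormat.setInputPaths call with FileInputFormat.addInputPath on the same conf object.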
Follow the steps below to pass multiple input files from different directories. Only the driver code changes; use the driver code below.
CODE:
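import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver method; Map1Class, Map2Class, ReducerClass and DriverClass are the job's own classes.
public int run(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "MultipleDirectoryAsInput");
    job.setJarByClass(DriverClass.class);
    job.setReducerClass(ReducerClass.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);
    // No job.setMapperClass(...) calls: MultipleInputs binds a mapper to each input path.
    // args[0] and args[1] are the two input directories, args[2] is the output directory.
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, Map1Class.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, Map2Class.class);
    FileOutputFormat.setOutputPath(job, new Path(args[2]));
    return job.waitForCompletion(true) ? 0 : 1;
}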
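Notice: The technical posts on this site follow the CC BY-SA 4.0 license. If you republish them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.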