Error while Chaining Map Reduce Jobs
My MapReduce structure:
public class ChainingMapReduce {

    public static class ChainingMapReduceMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // code
        }
    }

    public static class ChainingMapReduceReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // code
        }
    }

    public static class ChainingMapReduceMapper1
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // code
        }
    }

    public static class ChainingMapReduceReducer1
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // code
        }
    }

    public static void main(String[] args)
            throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "First");
        job.setJarByClass(ChainingMapReduce.class);
        job.setMapperClass(ChainingMapReduceMapper.class);
        job.setCombinerClass(ChainingMapReduceReducer.class);
        job.setReducerClass(ChainingMapReduceReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/home/Desktop/log"));
        FileOutputFormat.setOutputPath(job, new Path("/home/Desktop/temp/output"));
        job.waitForCompletion(true);
        System.out.println("First Job Completed.....Starting Second Job");
        System.out.println(job.isSuccessful());

        /* FileSystem hdfs = FileSystem.get(conf);
           Path fromPath = new Path("/home/Desktop/temp/output/part-r-00000");
           Path toPath = new Path("/home/Desktop/temp/output1");
           hdfs.rename(fromPath, toPath);
           conf.clear();
        */

        if (job.isSuccessful()) {
            Configuration conf1 = new Configuration();
            Job job1 = new Job(conf1, "Second");
            job1.setJarByClass(ChainingMapReduce.class);
            job1.setMapperClass(ChainingMapReduceMapper1.class);
            job1.setCombinerClass(ChainingMapReduceReducer1.class);
            job1.setReducerClass(ChainingMapReduceReducer1.class);
            job1.setOutputKeyClass(Text.class);
            job1.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path("/home/Desktop/temp/output/part-r-00000)");
            FileOutputFormat.setOutputPath(job, new Path("/home/Desktop/temp/output1"));
            System.exit(job1.waitForCompletion(true) ? 0 : 1);
        }
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
When I run this program, the first job executes perfectly, and then the following error appears:
First Job Completed.....Starting Second Job
true
12/01/27 15:24:21 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
12/01/27 15:24:21 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/01/27 15:24:21 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/01/27 15:24:21 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop/mapred/staging/4991311720439552/.staging/job_local_0002
Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at ChainingMapReduce.main(ChainingMapReduce.java:129)
I tried using "conf" for both jobs, and also using "conf" and "conf1" for the respective jobs.
The input and output paths of the second job are being set on "job" (the first job), so "job1" is submitted without an output directory, which is exactly what the InvalidJobConfException complains about. Change

FileInputFormat.addInputPath(job, new Path("/home/Desktop/temp/output/part-r-00000)");
FileOutputFormat.setOutputPath(job, new Path("/home/Desktop/temp/output1"));

to

FileInputFormat.addInputPath(job1, new Path("/home/Desktop/temp/output/part-r-00000"));
FileOutputFormat.setOutputPath(job1, new Path("/home/Desktop/temp/output1"));

for the second job. (Note also the stray ")" inside the input path string; it belongs outside the closing quote, as shown in the corrected lines.)
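With those two lines corrected, the tail of the driver could look like the following. This is a sketch based on the question's own code, not tested against a cluster; it also drops the final redundant job.waitForCompletion(true) call, since the first job has already completed by that point.

```java
if (job.isSuccessful()) {
    Configuration conf1 = new Configuration();
    Job job1 = new Job(conf1, "Second");
    job1.setJarByClass(ChainingMapReduce.class);
    job1.setMapperClass(ChainingMapReduceMapper1.class);
    job1.setCombinerClass(ChainingMapReduceReducer1.class);
    job1.setReducerClass(ChainingMapReduceReducer1.class);
    job1.setOutputKeyClass(Text.class);
    job1.setOutputValueClass(IntWritable.class);
    // Paths are now registered on job1; previously "job" was passed here,
    // which left job1 without an output directory (InvalidJobConfException).
    FileInputFormat.addInputPath(job1, new Path("/home/Desktop/temp/output/part-r-00000"));
    FileOutputFormat.setOutputPath(job1, new Path("/home/Desktop/temp/output1"));
    System.exit(job1.waitForCompletion(true) ? 0 : 1);
}
System.exit(1); // first job failed
```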
Also consider using org.apache.hadoop.mapred.jobcontrol.Job (or the newer JobControl/ControlledJob classes) and Apache Oozie for chaining jobs.
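If you go the JobControl route, a minimal driver with the newer org.apache.hadoop.mapreduce.lib.jobcontrol API might look like this. Treat it as an outline rather than tested code: the class and path names are carried over from the question, and the mapper/reducer wiring is elided.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job first = new Job(conf, "First");
        // ... set mapper/combiner/reducer/output classes as in the question ...
        FileInputFormat.addInputPath(first, new Path("/home/Desktop/log"));
        FileOutputFormat.setOutputPath(first, new Path("/home/Desktop/temp/output"));

        Job second = new Job(conf, "Second");
        // ... set mapper/combiner/reducer/output classes for the second stage ...
        FileInputFormat.addInputPath(second, new Path("/home/Desktop/temp/output"));
        FileOutputFormat.setOutputPath(second, new Path("/home/Desktop/temp/output1"));

        // Wrap each Job in a ControlledJob and declare the dependency.
        ControlledJob cFirst = new ControlledJob(first.getConfiguration());
        cFirst.setJob(first);
        ControlledJob cSecond = new ControlledJob(second.getConfiguration());
        cSecond.setJob(second);
        cSecond.addDependingJob(cFirst); // second runs only after first succeeds

        JobControl control = new JobControl("ChainingMapReduce");
        control.addJob(cFirst);
        control.addJob(cSecond);

        // JobControl implements Runnable; run it in a thread and poll.
        Thread runner = new Thread(control);
        runner.setDaemon(true);
        runner.start();
        while (!control.allFinished()) {
            Thread.sleep(500);
        }
        control.stop();
        System.exit(control.getFailedJobList().isEmpty() ? 0 : 1);
    }
}
```

The advantage over hand-chaining with waitForCompletion is that JobControl tracks the dependency graph itself and never submits the second job unless the first one succeeds.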