
Copying a MapFile to another location using MapReduce

I am writing a program to copy a MapFile to another location in HDFS.

Here is my code. The main class:

String uri = "hdfs://<ip>:8020/poc/input2";
String uri2 = "hdfs://<ip>:8020/poc/output1";
String uribyMapR = "hdfs://<ip>:8020/poc/outputbysetOutput";
boolean b = false;
Configuration conf = new Configuration();
conf.addResource(new Path("/hadoop/core-site.xml"));
conf.addResource(new Path("/hadoop/hdfs-site.xml"));
FileSystem filesystem = FileSystem.get(conf);
Path inputpath = new Path(uri);
Path outputpath = new Path(uri2);
Path outputbyMapR = new Path(uribyMapR);
// make the input/output URIs available to the mapper via the job configuration
conf.set("uri", uri);
conf.set("uri2", uri2);
if (filesystem.exists(outputpath))          
    filesystem.delete(outputpath, true);
if (filesystem.exists(outputbyMapR))            
    filesystem.delete(outputbyMapR, true);  
Job job = new Job(conf, "MapFile");
job.setJarByClass(Main.class);
job.setMapperClass(MapTry.class);
job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(Text.class);
job.setInputFormatClass(SequenceFileInputFormat.class);
SequenceFileInputFormat.addInputPath(job, inputpath);
MapFileOutputFormat.setOutputPath(job, outputbyMapR);
try {
    b = job.waitForCompletion(true);
} catch (IOException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
} catch (ClassNotFoundException e) {
    e.printStackTrace();
}
if (!b) {
    throw new IOException("The job failed");
}

The mapper class:

public class MapTry extends Mapper<LongWritable, Text, LongWritable, Text>{
MapFile.Reader reader = null;
MapFile.Writer writer = null;
@Override
protected void setup(Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    String uri = conf.get("uri");
    String uri2 = conf.get("uri2");
    FileSystem fs = FileSystem.get(conf);
    // open the source MapFile and a destination MapFile writer by hand
    // (the read/write calls in map() below are commented out; only close() is ever called)
    reader = new MapFile.Reader(fs, uri, conf);
    LongWritable key = new LongWritable(1);
    Text value = new Text();
    writer = new MapFile.Writer(conf, fs, uri2, key.getClass(), value.getClass());
}

@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    //reader.next(key, value);
    //reader.get(key, value);
    System.out.println(key.toString() + " " + value.toString());
    //writer.append(key, value);
    context.write(key, value);
}

@Override
protected void cleanup(Context context) throws IOException, InterruptedException{
    reader.close();
    writer.close();
}
}

The MapFile contains two files, data and index. The issue is that the job processes the data file successfully, but when it reaches the index file it throws the exception below (a short diagnostic sketch follows the stack trace):

java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:404)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.LongWritable
    at mapreduce.MapTry.map(MapTry.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
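
For context: a MapFile is a directory holding two SequenceFiles, data (the actual key/value records) and index (a subset of the keys plus LongWritable byte offsets into the data file). Because the job points SequenceFileInputFormat at the whole /poc/input2 directory, the index file is fed to the mapper as well, and its key/value classes need not match the Mapper<LongWritable, Text, ...> signature, which is one plausible source of the ClassCastException. A quick way to check what each part actually stores is to open them directly. This is only a small diagnostic sketch (the class name InspectMapFile is made up; it reuses the /poc/input2 path from above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

// Hypothetical standalone check, not part of the job above.
public class InspectMapFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        for (String part : new String[] { "data", "index" }) {
            Path p = new Path("hdfs://<ip>:8020/poc/input2/" + part);
            FileSystem fs = p.getFileSystem(conf);
            SequenceFile.Reader r = new SequenceFile.Reader(fs, p, conf);
            try {
                // print the key/value classes recorded in each part's header
                System.out.println(part + ": key=" + r.getKeyClassName()
                        + ", value=" + r.getValueClassName());
            } finally {
                r.close();
            }
        }
    }
}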

So, what should I do to write a MapFile using a MapReduce program?

I found out that if I just read the MapFile's data file as SequenceFile input and write the output as a MapFile, things work:

SequenceFileInputFormat.addInputPath(job, inputpath);
MapFileOutputFormat.setOutputPath(job, outputpath);
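
For completeness, here is roughly what the working driver ends up looking like, as a sketch rather than exact code (the class name CopyMapFileDriver is made up, and it assumes the data file holds LongWritable keys and Text values, as the mapper above expects). The input path points directly at the data file inside the MapFile directory so the index never reaches the mapper, and no Mapper or Reducer classes are set, so the identity implementations simply pass the records through:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.MapFileOutputFormat;

public class CopyMapFileDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "CopyMapFile"); // same old-style Job constructor as above
        job.setJarByClass(CopyMapFileDriver.class);

        // read only the data part of the MapFile, so the index file is never split or mapped
        job.setInputFormatClass(SequenceFileInputFormat.class);
        SequenceFileInputFormat.addInputPath(job,
                new Path("hdfs://<ip>:8020/poc/input2/data"));

        // identity mapper + identity reducer: the shuffle sorts the records by key,
        // which is exactly the ordering MapFileOutputFormat requires
        job.setOutputKeyClass(LongWritable.class);   // assumption: data keys are LongWritable
        job.setOutputValueClass(Text.class);         // assumption: data values are Text
        job.setOutputFormatClass(MapFileOutputFormat.class);
        MapFileOutputFormat.setOutputPath(job,
                new Path("hdfs://<ip>:8020/poc/output1"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

As far as I can tell, SequenceFileInputFormat will substitute a MapFile directory with its data file automatically, but only when that directory appears as an entry under the input path, not when the MapFile directory is the input path itself, which is why the original job also picked up the index file.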
