Error in Mapper Task in Hadoop 2.2 using MultilineJSON format

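I get the following error when I try to run a Map task while parsing MultilineJSONFormat data. I have all the necessary JARs, and the program compiles without any errors. The input looks like this: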

    [
        {
            "SeasonTicket": false, 
            "name": "Vinson Foreman", 
            "gender": "male", 
            "age": 50, 
            "email": "vinsonforeman@cyclonica.com", 
            "annualSalary": "$98,501.00", 
            "id": 0
        }, 
        {
            "SeasonTicket": true, 
            "name": "Genevieve Compton", 
            "gender": "female", 
            "age": 28, 
            "email": "genevievecompton@cyclonica.com", 
            "annualSalary": "$46,881.00", 
            "id": 1
        }
    ]

I am trying to count the occurrences of each value of the gender attribute: male or female. Please see the code below:

Mapper class:

public  class DemoMapper extends Mapper<LongWritable, Text, Text, Text> {
     private Text k = new Text();
      private Text v ;

    @Override  
    protected void map(LongWritable key , Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                  //String token = itr.nextToken();
                k.set((itr.nextToken()));
                    context.write(k, v);
            }
    }
}   

Reducer class:

public class DemoReducer extends Reducer<Text, IntWritable, Text, IntWritable>

{
    //@Override
    public void reduce(Text key, Iterable <IntWritable> values,
            Context context) throws IOException, InterruptedException {

                int sum = 0;
                while ((Iterable) values.iterator() != null) {

                    IntWritable value = values.iterator().next();
                        sum += value.get(); // process value*/
                }

            context.write(key, new IntWritable(sum));
          }
}

Main class:

public final class ExampleJob extends Configured implements Tool {

    public static void main(final String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new ExampleJob(), args);
        System.exit(res);
    }

     /**
     * The MapReduce driver - setup and launch the job.
     *
     * @param args the command-line arguments
     * @return the process exit code
     * @throws Exception if something goes wrong
     */
    public int run(final String[] args) throws Exception {

        Configuration conf = super.getConf();

      //  writeInput(conf, new Path(input));

        Job job = new Job(conf);
        job.setJarByClass(ExampleJob.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        job.setMapperClass(DemoMapper.class);
        job.setReducerClass(DemoReducer.class);
        job.setCombinerClass(DemoReducer.class);
       // job.setNumReduceTasks(1);


        Path path = new Path("result15");

        FileInputFormat.addInputPaths(job, "testfolder");
        FileOutputFormat.setOutputPath(job, path);

        // use the JSON input format
        job.setInputFormatClass(MultiLineJsonInputFormat.class);

        // specify the JSON attribute name which is used to determine which
        // JSON elements are supplied to the mapper
        MultiLineJsonInputFormat.setInputJsonMember(job,"gender");

        if (job.waitForCompletion(true)) {
            return 0;
        }
        return 1; 
       }
}

Stack trace:

    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2014-04-06 18:30:33,662 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-04-06 18:30:33,878 INFO  [main] jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2014-04-06 18:30:34,352 WARN  [main] mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(258)) - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2014-04-06 18:30:34,379 INFO  [main] input.FileInputFormat (FileInputFormat.java:listStatus(287)) - Total input paths to process : 1
2014-04-06 18:30:34,459 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(394)) - number of splits:1
2014-04-06 18:30:34,482 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - user.name is deprecated. Instead, use mapreduce.job.user.name
2014-04-06 18:30:34,484 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
2014-04-06 18:30:34,485 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
2014-04-06 18:30:34,486 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
2014-04-06 18:30:34,487 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
2014-04-06 18:30:34,487 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
2014-04-06 18:30:34,488 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
2014-04-06 18:30:34,488 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
2014-04-06 18:30:34,489 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2014-04-06 18:30:34,489 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2014-04-06 18:30:34,490 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
2014-04-06 18:30:34,495 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
2014-04-06 18:30:34,496 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(840)) - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
2014-04-06 18:30:34,881 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(477)) - Submitting tokens for job: job_local1580542852_0001
2014-04-06 18:30:35,005 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/staging/riak1580542852/.staging/job_local1580542852_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-04-06 18:30:35,006 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/staging/riak1580542852/.staging/job_local1580542852_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-04-06 18:30:35,412 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/local/localRunner/riak/job_local1580542852_0001/job_local1580542852_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-04-06 18:30:35,413 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2172)) - file:/tmp/hadoop-riak/mapred/local/localRunner/riak/job_local1580542852_0001/job_local1580542852_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-04-06 18:30:35,437 INFO  [main] mapreduce.Job (Job.java:submit(1272)) - The url to track the job: http://localhost:8080/
2014-04-06 18:30:35,439 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1317)) - Running job: job_local1580542852_0001
2014-04-06 18:30:35,441 INFO  [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(323)) - OutputCommitter set in config null
2014-04-06 18:30:35,453 INFO  [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:createOutputCommitter(341)) - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2014-04-06 18:30:35,543 INFO  [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(389)) - Waiting for map tasks
2014-04-06 18:30:35,545 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner (LocalJobRunner.java:run(216)) - Starting task: attempt_local1580542852_0001_m_000000_0
2014-04-06 18:30:35,689 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:initialize(581)) -  Using ResourceCalculatorProcessTree : [ ]
2014-04-06 18:30:35,700 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:runNewMapper(732)) - Processing split: file:/home/riak/workspace/Hadooprun/testfolder/file1.json:0+7703579
2014-04-06 18:30:35,733 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:createSortingCollector(387)) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2014-04-06 18:30:36,585 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1338)) - Job job_local1580542852_0001 running in uber mode : false
2014-04-06 18:30:36,588 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1345)) -  map 0% reduce 0%
2014-04-06 18:30:36,593 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:setEquator(1183)) - (EQUATOR) 0 kvi 26214396(104857584)
2014-04-06 18:30:36,593 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(975)) - mapreduce.task.io.sort.mb: 100
2014-04-06 18:30:36,594 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(976)) - soft limit at 83886080
2014-04-06 18:30:36,594 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(977)) - bufstart = 0; bufvoid = 104857600
2014-04-06 18:30:36,594 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:init(978)) - kvstart = 26214396; length = 6553600
2014-04-06 18:30:36,622 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:flush(1440)) - Starting flush of map output
2014-04-06 18:30:36,649 INFO  [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(397)) - Map task executor complete.
2014-04-06 18:30:36,652 WARN  [Thread-12] mapred.LocalJobRunner (LocalJobRunner.java:run(482)) - job_local1580542852_0001
java.lang.Exception: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1054)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:25)
    at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
2014-04-06 18:30:37,594 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1358)) - Job job_local1580542852_0001 failed with state FAILED due to: NA
2014-04-06 18:30:37,605 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1363)) - Counters: 0

Looking at the stack trace:

Caused by: java.lang.NullPointerException
...
at main.java.com.alexholmes.json.mapreduce.DemoMapper.map(DemoMapper.java:25)

In the mapper, the member "Text v" is never initialized, yet it is written to the context.

private Text v ;
...
context.write(k, v);

You need to initialize "v" with new Text().
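To illustrate why an unassigned value field fails inside the collector rather than in the mapper's own code, here is a minimal, Hadoop-free sketch. `StubCollector` and `NullValueDemo` are hypothetical stand-ins, not Hadoop classes. The sketch assumes the real collector inspects the value's runtime class, which would be consistent with the exception being raised in `MapOutputBuffer.collect` in the stack trace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the map output collector; NOT the Hadoop API.
class StubCollector {
    private final Class<?> valueClass;
    final List<String> written = new ArrayList<>();

    StubCollector(Class<?> valueClass) {
        this.valueClass = valueClass;
    }

    // Assumption: like the real collector, this looks at the value's runtime
    // class, so passing a null value throws a NullPointerException here.
    void write(Object key, Object value) {
        if (value.getClass() != valueClass) {
            throw new IllegalStateException("unexpected value type: "
                    + value.getClass().getName());
        }
        written.add(key + "\t" + value);
    }
}

class NullValueDemo {
    // Mirrors the question's mapper: the value field is declared but never assigned.
    private StringBuilder uninitialized;

    // Mirrors the suggested fix: the value field is created up front and reused.
    private final StringBuilder initialized = new StringBuilder();

    // Returns false when the write fails with a NullPointerException.
    boolean writeUninitialized(StubCollector out) {
        try {
            out.write("gender", uninitialized);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // Resets the reusable value, fills it, and writes it; returns true on success.
    boolean writeInitialized(StubCollector out) {
        initialized.setLength(0);
        initialized.append("male");
        out.write("gender", initialized);
        return true;
    }
}
```

In the actual mapper, the equivalent change would be to declare the field as `private Text v = new Text();` and to set it before each `context.write(k, v)`.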

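In addition to aasoj's answer, I would like to point out that the mapper's output is fed to the reducer as its input. In the reducer class, the input key and value types are "Text" and "IntWritable", whereas the mapper class emits "Text" and "Text" as its output key and value.

Try changing the reducer's input key and value types so that they match the mapper's output types, as follows: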

public class DemoMapper extends Mapper<LongWritable, Text, Text, Text>

public class DemoReducer extends Reducer<Text, Text, Text, IntWritable>

Apart from the above, everything looks fine to me.
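The type agreement described here can be sketched without Hadoop. The following is a plain-Java illustration under stated assumptions: `GenderCountSketch` and its methods are hypothetical names, `String` and `Integer` stand in for `Text` and `IntWritable`, and the grouping map stands in for the shuffle. The "map" step emits one `(gender, 1)` pair per record, and the "reduce" step accepts exactly the value type that the "map" step emitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the map -> group -> reduce flow.
class GenderCountSketch {

    // "Map" and "shuffle": emit (gender, 1) per record and group values by key.
    // The emitted value type (Integer) is what the "reduce" step must accept.
    static Map<String, List<Integer>> mapAndGroup(List<String> genders) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String gender : genders) {
            grouped.computeIfAbsent(gender, g -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // "Reduce": a for-each loop visits every value once and stops when the
    // values are exhausted; comparing an iterator to null would not do that.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs both steps and returns the total per gender.
    static Map<String, Integer> count(List<String> genders) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : mapAndGroup(genders).entrySet()) {
            totals.put(entry.getKey(), reduce(entry.getValue()));
        }
        return totals;
    }
}
```

The same consistency presumably has to hold in the driver as well: the question's job declares `LongWritable` as the map output value class while the mapper emits `Text`, so those settings would need to agree with whichever types the mapper and reducer end up using.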
