Null Pointer Exception - Hadoop Mapreduce job
I am a beginner with Hadoop and Java. I am writing Map and Reduce functions to cluster a set of latitudes and longitudes into groups by proximity, where each group has a set magnitude (the number of lat-long pairs in the cluster) and a representative lat-long pair (so far, the first lat-long pair encountered).
Here is my code:
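The grouping idea described above (points whose geohash shares a prefix land in the same tile) can be sketched without Hadoop. This is a minimal illustration using made-up geohash strings rather than a real encoder, since the `util.hashing.Geohash` library's API is not shown here:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrefixBuckets {
    public static void main(String[] args) {
        // Hypothetical geohash strings for a few points (illustrative only,
        // not produced by a real geohash encoder).
        List<String> hashes = Arrays.asList("9q8yyk8", "9q8yyjx", "dr5regw");
        int accuracy = 4; // same truncation length the mapper uses

        // Bucket each full hash under its 4-character prefix, mirroring
        // the mapper's (shortenedHash, origHash) output pairs.
        Map<String, List<String>> tiles = new HashMap<>();
        for (String h : hashes) {
            tiles.computeIfAbsent(h.substring(0, accuracy), k -> new ArrayList<>()).add(h);
        }
        System.out.println(tiles.get("9q8y").size()); // prints 2
    }
}
```

Hadoop's shuffle phase then plays the role of the `HashMap`: all values sharing a key (the shortened hash) arrive at one `reduce()` call.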
package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import util.hashing.*;

public class LatLong {

    public static class Map extends Mapper<Object, Text, Text, Text> {
        //private final static IntWritable one = new IntWritable(1);

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] longLatArray = line.split(",");
            double longi = Double.parseDouble(longLatArray[0]);
            double lat = Double.parseDouble(longLatArray[1]);
            //List<Double> origLatLong = new ArrayList<Double>(2);
            //origLatLong.add(lat);
            //origLatLong.add(longi);
            Geohash inst = Geohash.getInstance();
            //encode is the library's encoding function
            String hash = inst.encode(lat, longi);
            //Using the first 5 characters just for testing purposes
            //Need to find the right one later
            int accuracy = 4;
            //hash of the thing is shortened to whatever I figure out
            //to be the right size of each tile
            Text shortenedHash = new Text(hash.substring(0, accuracy));
            Text origHash = new Text(hash);
            context.write(shortenedHash, origHash);
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        private IntWritable totalTileElementCount = new IntWritable();
        private Text latlongimag = new Text();
        private Text dataSeparator = new Text();

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            int elementCount = 0;
            boolean first = true;
            Iterator<Text> it = values.iterator();
            String lat = new String();
            String longi = new String();
            Geohash inst = Geohash.getInstance();
            while (it.hasNext()) {
                elementCount = elementCount + 1;
                if (first) {
                    lat = Double.toString((inst.decode(it.toString()))[0]);
                    longi = Double.toString((inst.decode(it.toString()))[1]);
                    first = false;
                }
                @SuppressWarnings("unused")
                String blah = it.next().toString();
            }
            totalTileElementCount.set(elementCount);
            //Geohash inst = Geohash.getInstance();
            String mag = totalTileElementCount.toString();
            latlongimag.set(lat + "," + longi + "," + mag + ",");
            dataSeparator.set("");
            context.write(latlongimag, dataSeparator);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(LatLong.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
I am getting an NPE. I don't know how to test this, and I can't find the bug in the code.
Hadoop error:
java.lang.NullPointerException
at util.hashing.Geohash.decode(Geohash.java:41)
at org.myorg.LatLong$Reduce.reduce(LatLong.java:67)
at org.myorg.LatLong$Reduce.reduce(LatLong.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:663)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:426)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
The decode function in the Geohash library returns an array of doubles. Any pointers would be appreciated! Thanks for your time!
EDIT 1 (after testing):
I have realized that the problem is that the reduce function needs it.next().toString() rather than just it.toString(). But when I made that change and tested it, I got the error below, and I don't understand why it occurs, given that I check hasNext() in the while-loop condition.
java.util.NoSuchElementException: iterate past last value
at org.apache.hadoop.mapreduce.ReduceContext$ValueIterator.next(ReduceContext.java:159)
at org.myorg.LatLong$Reduce.reduce(LatLong.java:69)
at org.myorg.LatLong$Reduce.reduce(LatLong.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:663)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:426)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
EDIT 2 (further testing): solution
I was calling it.next() more than once within a single iteration. Since it is an iterator, each call advances it, so on the last iteration the loop condition passes and the body is entered, but then it.next() is called twice, which causes the problem because only one element (the last one) remains.
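This failure mode can be reproduced with a plain Java iterator, with no Hadoop involved (a minimal sketch with arbitrary sample values):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class DoubleNextDemo {
    public static void main(String[] args) {
        // Three elements: hasNext() passes on the last one,
        // but a second next() inside the body runs past the end.
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        try {
            while (it.hasNext()) {
                it.next(); // first advance: fine
                it.next(); // second advance: throws when only one element was left
            }
        } catch (NoSuchElementException e) {
            System.out.println("iterated past last value");
        }
    }
}
```

Each hasNext() check guarantees at most one remaining element, so any loop body must call next() exactly once per iteration.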
You are still calling toString on it, not on it.next(), so you should change

lat = Double.toString((inst.decode(it.toString()))[0]);
longi = Double.toString((inst.decode(it.toString()))[1]);

into

String cords = it.next().toString();
lat = Double.toString((inst.decode(cords))[0]);
longi = Double.toString((inst.decode(cords))[1]);

Do not write inst.decode(it.next().toString()) on both lines, because that would call it.next() twice within one while iteration.

Afterwards, do not also call String blah = it.next().toString();, or you will get java.util.NoSuchElementException: iterate past last value for the same reason.

And when you remove String blah = it.next().toString();, keep in mind that once first = false you never enter if(first) again, so String cords = it.next().toString(); is never called, it.hasNext() will always return true, and you will never exit the while loop. So add an appropriate conditional statement.
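Putting those fixes together, the loop should consume exactly one value per iteration and decode only the first one. Here is a Hadoop-free sketch of that shape, with a stub decode standing in for the Geohash library (whose real implementation is not shown here) and placeholder input strings:

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReduceLoopSketch {
    // Stub standing in for util.hashing.Geohash.decode; per the question,
    // the real library returns a double[] of {lat, long} for a geohash string.
    static double[] decode(String hash) {
        return new double[] { 1.0, 2.0 }; // placeholder coordinates
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("h1", "h2", "h3").iterator();
        int elementCount = 0;
        String lat = "";
        String longi = "";
        boolean first = true;
        while (it.hasNext()) {
            String cords = it.next(); // the ONLY next() call per iteration
            if (first) {
                // Decode only the first value as the cluster representative.
                lat = Double.toString(decode(cords)[0]);
                longi = Double.toString(decode(cords)[1]);
                first = false;
            }
            elementCount++;
        }
        System.out.println(lat + "," + longi + "," + elementCount); // prints 1.0,2.0,3
    }
}
```

Because next() is called unconditionally at the top of the body, the loop always makes progress and terminates, and the first-element decode happens at most once.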
這意味着您的“ it”為空,或者解碼后為空。 對它們進行空檢查。