I am trying to solve the following RecordReader problem. Example input file:
1,1
2,2
3,3
4,4
5,5
6,6
7,7
.......
.......
I want my RecordReader to return:
key | Value
0 |1,1:2,2:3,3:4,4:5,5
4 |2,2:3,3:......6,6
6 |3,3:4,4......6,6:7,7
(For the first value, the first five lines; for the second value, five lines starting from the second line; for the third value, five lines starting from the third line; and so on.)
public class MyRecordReader extends RecordReader<LongWritable, Text> {
    @Override
    public boolean nextKeyValue() throws IOException, InterruptedException {
        while (pos < end) {
            key.set(pos);
            // five line logic
            Text nextLine = new Text();
            int newSize = in.readLine(value, maxLineLength,
                    Math.max((int) Math.min(Integer.MAX_VALUE, end - pos),
                            maxLineLength));
            fileSeek += newSize;
            for (int n = 0; n < 4; n++) {
                fileSeek += in.readLine(nextLine, maxLineLength,
                        Math.max((int) Math.min(Integer.MAX_VALUE, end - pos),
                                maxLineLength));
                value.append(":".getBytes(), 0, 1);
                value.append(nextLine.getBytes(), 0, nextLine.getLength());
            }
            if (newSize == 0) {
                return false;
            }
            pos += newSize;
            if (newSize < maxLineLength) {
                return true;
            }
            // line too long. try again
            LOG.info("Skipped line of size " + newSize + " at pos " + (pos - newSize));
        }
        return false;
    }
}
But this returns the values as:
key | Value
0 |1,1:2,2:3,3:4,4:5,5
4 |6,6:7,7.......10,10
6 |11,11:12,12:......14,14
Can someone help me fix this code? A fresh implementation of the RecordReader would do as well. Requirement of the problem (may help you understand the use case). Thanks.
I think I understand the question... here's what I would do: wrap another RecordReader and buffer the keys/values from it into a local queue.
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Queue;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class MyRecordReader extends RecordReader<LongWritable, Text> {
    private static final int BUFFER_SIZE = 5;
    private static final String DELIMITER = ":";

    // Sliding window over the last BUFFER_SIZE records of the wrapped reader.
    private Queue<String> valueBuffer = new LinkedList<String>();
    private Queue<Long> keyBuffer = new LinkedList<Long>();
    private LongWritable key = new LongWritable();
    private Text value = new Text();
    private RecordReader<LongWritable, Text> rr;

    public MyRecordReader(RecordReader<LongWritable, Text> rr) {
        this.rr = rr;
    }

    @Override
    public void close() throws IOException {
        rr.close();
    }

    @Override
    public LongWritable getCurrentKey() throws IOException, InterruptedException {
        return key;
    }

    @Override
    public Text getCurrentValue() throws IOException, InterruptedException {
        return value;
    }

    @Override
    public float getProgress() throws IOException, InterruptedException {
        return rr.getProgress();
    }

    @Override
    public void initialize(InputSplit arg0, TaskAttemptContext arg1)
            throws IOException, InterruptedException {
        rr.initialize(arg0, arg1);
    }

    @Override
    public boolean nextKeyValue() throws IOException, InterruptedException {
        if (valueBuffer.isEmpty()) {
            // First call: fill the window with BUFFER_SIZE records.
            while (valueBuffer.size() < BUFFER_SIZE) {
                if (rr.nextKeyValue()) {
                    keyBuffer.add(rr.getCurrentKey().get());
                    valueBuffer.add(rr.getCurrentValue().toString());
                } else {
                    return false;
                }
            }
        } else {
            // Later calls: append one record and drop the oldest one.
            if (rr.nextKeyValue()) {
                keyBuffer.add(rr.getCurrentKey().get());
                valueBuffer.add(rr.getCurrentValue().toString());
                keyBuffer.remove();
                valueBuffer.remove();
            } else {
                return false;
            }
        }
        // The key is the key of the first record in the window.
        key.set(keyBuffer.peek());
        value.set(getValue());
        return true;
    }

    // Joins the buffered values with DELIMITER.
    private String getValue() {
        StringBuilder sb = new StringBuilder();
        Iterator<String> iter = valueBuffer.iterator();
        while (iter.hasNext()) {
            sb.append(iter.next());
            if (iter.hasNext()) sb.append(DELIMITER);
        }
        return sb.toString();
    }
}
Then, for example, you can have a custom InputFormat that extends TextInputFormat and overrides the createRecordReader method to call super.createRecordReader and return that result wrapped in a MyRecordReader, like this:
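import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
public class MyTextInputFormat extends TextInputFormat {
    @Override
    public RecordReader<LongWritable, Text> createRecordReader(
            InputSplit arg0, TaskAttemptContext arg1) {
        return new MyRecordReader(super.createRecordReader(arg0, arg1));
    }
}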
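If you want to check the buffering idea without a Hadoop cluster, here is a minimal stand-alone sketch of the same sliding window in plain Java. The class name SlidingWindowDemo and the method windows are hypothetical names made up for this illustration and are not part of Hadoop; the sketch also uses a single ArrayDeque instead of the two queues above, because it does not track keys.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-alone demo of the sliding-window buffering; not a Hadoop class.
class SlidingWindowDemo {

    // Returns one joined string for every window of `size` consecutive lines.
    static List<String> windows(List<String> lines, int size, String delimiter) {
        Deque<String> buffer = new ArrayDeque<>();
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            buffer.addLast(line);
            if (buffer.size() > size) {
                buffer.removeFirst(); // slide the window: drop the oldest line
            }
            if (buffer.size() == size) {
                result.add(String.join(delimiter, buffer));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,1", "2,2", "3,3", "4,4", "5,5", "6,6", "7,7");
        for (String window : windows(lines, 5, ":")) {
            System.out.println(window);
        }
        // prints:
        // 1,1:2,2:3,3:4,4:5,5
        // 2,2:3,3:4,4:5,5:6,6
        // 3,3:4,4:5,5:6,6:7,7
    }
}
```

As in the reader above, no window is emitted until five lines have been buffered, so an input with fewer than five lines produces nothing.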