
How to skip lines with ItemReader in Spring-Batch?

I have a custom item reader that transforms lines from a textfile to my entity:

public class EntityItemReader extends AbstractItemStreamItemReader<MyEntity> {
    @Override
    public MyEntity read() {
       String line = delegate.read();
       //analyze line and skip by condition
       //line.split
       //create entity with line values
    }
}

This is similar to the FlatFileItemReader.

The MyEntity that was read will then be persisted to a DB by a JDBC item writer.

Problem: sometimes I have lines that contain values that should be skipped.

BUT when I just return null inside the read() method of the reader, not only is this item skipped, but reading is terminated completely and all further lines are skipped, because a null element is the "signal" for all Spring readers that the file to be read is finished.

So: what can I do to skip specific lines by condition inside the reader if I cannot return null? Because by nature of the reader I'm forced to return an object here.

I think the good practice for filtering out some lines is to use a processor rather than the reader: in a processor you can return null when you want to filter a line.

Please see http://docs.spring.io/spring-batch/trunk/reference/html/readersAndWriters.html :

6.3.2 Filtering Records

One typical use for an item processor is to filter out records before they are passed to the ItemWriter. Filtering is an action distinct from skipping; skipping indicates that a record is invalid whereas filtering simply indicates that a record should not be written.

For example, consider a batch job that reads a file containing three different types of records: records to insert, records to update, and records to delete. If record deletion is not supported by the system, then we would not want to send any "delete" records to the ItemWriter. But, since these records are not actually bad records, we would want to filter them out, rather than skip. As a result, the ItemWriter would receive only "insert" and "update" records.

To filter a record, one simply returns "null" from the ItemProcessor. The framework will detect that the result is "null" and avoid adding that item to the list of records delivered to the ItemWriter. As usual, an exception thrown from the ItemProcessor will result in a skip.
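As a rough illustration of the filtering described above, here is a minimal sketch. It assumes a hypothetical MyEntity with a "type" field, and it declares a small stand-in interface with the same shape as Spring Batch's ItemProcessor so that it compiles without the framework (the real interface also declares "throws Exception"). The processChunk helper only imitates how a chunk-oriented step drops null results; it is not the framework's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in shaped like Spring Batch's ItemProcessor,
// declared here only so the sketch compiles without the framework.
interface ItemProcessor<I, O> {
    O process(I item);
}

// Hypothetical entity; the real MyEntity is not shown in the question.
class MyEntity {
    final String type;
    final String value;

    MyEntity(String type, String value) {
        this.type = type;
        this.value = value;
    }
}

// Returns null for records that should not reach the writer.
class DeleteFilteringProcessor implements ItemProcessor<MyEntity, MyEntity> {
    @Override
    public MyEntity process(MyEntity item) {
        if ("delete".equals(item.type)) {
            return null; // filtered: simply not passed on to the writer
        }
        return item;
    }
}

class FilterSketch {
    // Imitates a chunk-oriented step: null results are dropped
    // before the list is handed to the writer.
    static List<MyEntity> processChunk(List<MyEntity> chunk,
                                       ItemProcessor<MyEntity, MyEntity> processor) {
        List<MyEntity> toWrite = new ArrayList<>();
        for (MyEntity item : chunk) {
            MyEntity result = processor.process(item);
            if (result != null) {
                toWrite.add(result);
            }
        }
        return toWrite;
    }
}
```

With this in place the reader can keep returning one object per line, and null keeps its meaning of "end of input" in the reader only.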

I've had a similar problem in the more general case of a custom reader that is backed by an iterator over one object type and returns a new item (of a different type) for each object read. The problem is that some of those objects don't map to anything, so I'd like to return something that marks that.

Eventually I've decided to define an INVALID_ITEM and return that. Another approach could be to advance the iterator in the read() method, until the next valid item, with null returned if .hasNext() becomes false, but that is more cumbersome.
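The second approach (advancing inside read() until the next valid item) could look roughly like the sketch below. The ItemReader interface here is a hypothetical stand-in shaped like Spring Batch's ItemReader so that the example compiles on its own, and shouldSkip is an assumed condition (blank lines and lines starting with '#'), not something taken from the original posts.

```java
import java.util.Iterator;

// Hypothetical stand-in shaped like Spring Batch's ItemReader,
// declared here only so the sketch compiles without the framework.
interface ItemReader<T> {
    T read();
}

class SkippingLineReader implements ItemReader<String[]> {
    private final Iterator<String> lines;

    SkippingLineReader(Iterator<String> lines) {
        this.lines = lines;
    }

    @Override
    public String[] read() {
        // Keep advancing until a line passes the condition.
        while (lines.hasNext()) {
            String line = lines.next();
            if (shouldSkip(line)) {
                continue; // skipped line: try the next one
            }
            return line.split(",");
        }
        // Only return null once the input is really exhausted.
        return null;
    }

    // Assumed skip condition, for illustration only.
    private boolean shouldSkip(String line) {
        return line.trim().isEmpty() || line.startsWith("#");
    }
}
```

The loop keeps the "null means finished" contract intact, at the cost of hiding the skipped lines from any skip counting the step might do.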

Initially I also tried to throw a custom exception and tell Spring to skip the item when it occurs, but it seemed to be ignored, so I gave up (and if there are too many invalid items, it isn't performant anyway).

I do not think you can have your cake and eat it too in this case (and after reading all the comments). My best suggestion would be (as already suggested) to throw a custom Exception and skip on it. You can maybe optimize your entity creation or processes elsewhere so you don't lose so much performance. Good luck.

We can handle it via a custom Dummy Object.

// Shared dummy instance; static and created eagerly, so it can be compared by reference
private static final MyClass DUMMYMyClassObject = new MyClass();

private MyClass() {
    // create blank object
}

public static MyClass getDummyyClassObject() {
    return DUMMYMyClassObject;
}

And just use the below when required to skip the record in the reader :

return MyClass.getDummyyClassObject();

The processor can then ignore this object, either by checking whether the object is blank or by checking for whatever state the private default constructor sets up.
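A processor that drops the dummy object might look like the following sketch. The ItemProcessor interface is again a hypothetical stand-in for the Spring Batch one, and the "value" field and the of(...) factory on MyClass are assumptions added only to make the example self-contained.

```java
// Hypothetical stand-in shaped like Spring Batch's ItemProcessor,
// declared here only so the sketch compiles without the framework.
interface ItemProcessor<I, O> {
    O process(I item);
}

class MyClass {
    // Single shared dummy instance used as the "skip me" marker.
    private static final MyClass DUMMY = new MyClass(null);

    final String value; // assumed payload field, for illustration only

    private MyClass(String value) {
        this.value = value;
    }

    // Assumed factory for real items.
    static MyClass of(String value) {
        return new MyClass(value);
    }

    static MyClass getDummyyClassObject() {
        return DUMMY;
    }
}

class DummyFilteringProcessor implements ItemProcessor<MyClass, MyClass> {
    @Override
    public MyClass process(MyClass item) {
        // A reference comparison is enough because the dummy is a single shared instance.
        if (item == MyClass.getDummyyClassObject()) {
            return null; // filtered: never reaches the writer
        }
        return item;
    }
}
```

Comparing by reference avoids having to define what a "blank" object means for every field.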

To skip lines, you can throw an exception whenever a line should be skipped, as below.

My Spring batch Step

@Bean
Step processStep() {

    return stepBuilderFactory.get("job step")
                .<String, String>chunk(1000)
                .reader(ItemReader)
                .writer(DataWriter)
                .faultTolerant() //allowing spring batch to skip line 
                .skipLimit(1000) //skip line limit
                .skip(CustomException.class) //skip lines when this exception is thrown
                .build();
}

My Item reader

@Bean(name = "reader")
public FlatFileItemReader<String> fileItemReader() throws Exception {

    FlatFileItemReader<String> reader = new FlatFileItemReader<String>();

    reader.setResource(resourceLoader.getResource("file:c:/file_location/file.txt"));
    CustomLineMapper lineMapper = new CustomLineMapper();
    reader.setLineMapper(lineMapper);

    return reader;

}

My custom line mapper

public class CustomLineMapper implements LineMapper<String> {

    @Override
    public String mapLine(String s, int i) throws Exception {
        // put your condition here when you want to skip lines
        if (Condition)
            throw new CustomException();
        return s;
    }
}

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 