![](/img/trans.png)
[英]Does Spring Batch release the heap memory after processing each batch?
[英]Spring batch does not continue processing records after xml reading error
我配置了spring批处理,以在读取xml文件时出错时跳过不良记录。 skipPolicy实现始终返回true,以跳过错误的记录。 该工作需要继续处理其余记录,但是在我的情况下,它会在不良记录完成后停止。
@Configuration
@Import(DataSourceConfig.class)
@EnableWebMvc
@ComponentScan(basePackages = "org.nova.batch")
@EnableBatchProcessing
public class BatchIssueConfiguration {
private static final Logger LOG =LoggerFactory.getLogger(BatchIssueConfiguration.class);
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Bean(name = "jobRepository")
public JobRepository jobRepository(DataSource dataSource, PlatformTransactionManager transactionManager) throws Exception {
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDatabaseType("derby");
factory.setDataSource(dataSource);
factory.setTransactionManager(transactionManager);
return factory.getObject();
}
@Bean
public Step stepSGR() throws IOException{
return stepBuilderFactory.get("ETL_STEP").<SigmodRecord.Issue,SigmodRecord.Issue>chunk(1)
//.processor(itemProcessor())
.writer(itemWriter())
.reader(multiReader())
.faultTolerant()
.skipLimit(Integer.MAX_VALUE)
.skipPolicy(new FileVerificationSkipper())
.skip(Throwable.class)
.build();
}
@Bean
public SkipPolicy fileVerificationSkipper(){
return new FileVerificationSkipper();
}
@Bean
@JobScope
public MultiResourceItemReader<SigmodRecord.Issue> multiReader() throws IOException{
MultiResourceItemReader<SigmodRecord.Issue> mrir = new MultiResourceItemReader<SigmodRecord.Issue>();
//FileSystemResource [] files = new FileSystemResource [{}];
ResourcePatternResolver rpr = new PathMatchingResourcePatternResolver();
Resource[] resources = rpr.getResources("file:c:/temp/Sigm*.xml");
mrir.setResources( resources);
mrir.setDelegate(xmlItemReader());
return mrir;
}
}
public class FileVerificationSkipper implements SkipPolicy {
private static final Logger LOG = LoggerFactory.getLogger(FileVerificationSkipper.class);
@Override
public boolean shouldSkip(Throwable t, int skipCount) throws SkipLimitExceededException {
LOG.error("There is an error {}",t);
return true;
}
}
该文件具有包含导致读取错误的“&”的输入,即
<title>Notes of DDTS & n Apparatus for Experimental Research</title>
这将引发以下错误:
org.springframework.dao.DataAccessResourceFailureException: Error reading XML stream; nested exception is javax.xml.stream.XMLStreamException: ParseError at [row,col]:[127,25]
Message: The entity name must immediately follow the '&' in the entity reference.
我在配置中是否做错了什么,不允许其余的记录继续处理。
要跳过某些类型的异常,我们可以提及跳过策略,在该策略中我们可以编写用于跳过异常的自定义逻辑。 像下面的代码。
@Bean
public Step stepSGR() throws IOException{
return stepBuilderFactory.get("ETL_STEP").<SigmodRecord.Issue,SigmodRecord.Issue>chunk(1)
//.processor(itemProcessor())
.writer(itemWriter())
.reader(multiReader())
.faultTolerant()
.skipPolicy(new FileVerificationSkipper())
.build();
}
public class FileVerificationSkipper implements SkipPolicy {
private static final Logger LOG = LoggerFactory.getLogger(FileVerificationSkipper.class);
@Override
public boolean shouldSkip(Throwable t, int skipCount) throws SkipLimitExceededException {
LOG.error("There is an error {}",t);
if (t instanceof DataAccessResourceFailureException)
return true;
}
}
或者,您可以像下面这样简单地进行设置。
@Bean
public Step stepSGR() throws IOException{
return stepBuilderFactory.get("ETL_STEP").<SigmodRecord.Issue,SigmodRecord.Issue>chunk(1)
//.processor(itemProcessor())
.writer(itemWriter())
.reader(multiReader())
.faultTolerant()
.skipLimit(Integer.MAX_VALUE)
.skip(DataAccessResourceFailureException.class)
.build();
}
这个问题属于格式错误的xml,看来除了修复xml本身之外,没有其他方法可以恢复。 春天的StaxEventItemReader在xml的低层解析中使用XMLEventReader,因此我尝试使用XMLEventReader读取xml文件以尝试跳过坏块,但是XMLEventReader.nextEvent()始终在坏块所在的位置抛出异常。 我试图在try catch块中处理该事件,以跳到下一个事件,但是看来读者不会移动到下一个事件。 因此,目前解决此问题的唯一方法是在处理XML之前先对其进行修复。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.