Serialization issue with Spark and ObjectMapper in a Spring Boot-based application

I'm using Spark, and one of the beans in my Spring Boot-based application looks like this:

@Component
@RequiredArgsConstructor
public class SomeService implements FlatMapFunction<T, K> {

  private final ObjectMapper mapper;
  
}

The ObjectMapper here is the standard one taken from the application context. The problem is that the app fails with org.apache.spark.SparkException: Task not serializable. Here's the serialization stack:

Caused by: java.io.NotSerializableException: org.springframework.http.converter.json.SpringHandlerInstantiator
Serialization stack:
- object not serializable (class: org.springframework.http.converter.json.SpringHandlerInstantiator, value: org.springframework.http.converter.json.SpringHandlerInstantiator@6e4912db)
- field (class: com.fasterxml.jackson.databind.cfg.BaseSettings, name: _handlerInstantiator, type: class com.fasterxml.jackson.databind.cfg.HandlerInstantiator)
- object (class com.fasterxml.jackson.databind.cfg.BaseSettings, com.fasterxml.jackson.databind.cfg.BaseSettings@155616d8)
- field (class: com.fasterxml.jackson.databind.cfg.MapperConfig, name: _base, type: class com.fasterxml.jackson.databind.cfg.BaseSettings)
- object (class com.fasterxml.jackson.databind.DeserializationConfig, com.fasterxml.jackson.databind.DeserializationConfig@66e72ca2)
- field (class: com.fasterxml.jackson.databind.ObjectMapper, name: _deserializationConfig, type: class com.fasterxml.jackson.databind.DeserializationConfig)
- object (class com.fasterxml.jackson.databind.ObjectMapper, com.fasterxml.jackson.databind.ObjectMapper@433ef204)
- field (class: com.smth.SomeService, name: mapper, type: class com.fasterxml.jackson.databind.ObjectMapper)

So the problem is the non-serializable SpringHandlerInstantiator.

So far I have worked around this by assigning the mapper field manually in the constructor:

public SomeService() {
  this.mapper = new ObjectMapper();
}

Is there a way to solve this properly, i.e. relying on Spring's DI?

I use Spring Boot 2.6.7 and Spark 2.11.

Try configuring the ObjectMapper bean to use a serializable HandlerInstantiator. In the following example, MyHandlerInstantiator is a custom implementation of HandlerInstantiator that is serializable (e.g. you can start from the code of SpringHandlerInstantiator and patch it minimally). This should make the SomeService class serializable as well, and therefore usable in a distributed environment such as Apache Spark.

@Configuration
public class MyConfiguration {
  
  @Bean
  public ObjectMapper objectMapper() {
    ObjectMapper mapper = new ObjectMapper();
    mapper.setHandlerInstantiator(new MyHandlerInstantiator());
    return mapper;
  }
}
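
To see why swapping the instantiator is enough, here is a minimal JDK-only sketch of the same object graph. It assumes the task is shipped with plain Java serialization, as the stack trace above suggests. All class names in it (Instantiator, NonSerializableInstantiator, SerializableInstantiator, Mapper, Service, TaskSerializationSketch) are hypothetical stand-ins for HandlerInstantiator, SpringHandlerInstantiator, MyHandlerInstantiator, ObjectMapper and SomeService. It is not Jackson or Spark code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for Jackson's HandlerInstantiator.
interface Instantiator {
}

// Stand-in for SpringHandlerInstantiator: does not implement Serializable.
class NonSerializableInstantiator implements Instantiator {
}

// Stand-in for the answer's MyHandlerInstantiator: implements Serializable.
class SerializableInstantiator implements Instantiator, Serializable {
    private static final long serialVersionUID = 1L;
}

// Stand-in for ObjectMapper: serializable itself, but it holds an instantiator.
class Mapper implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Instantiator instantiator;

    Mapper(Instantiator instantiator) {
        this.instantiator = instantiator;
    }
}

// Stand-in for SomeService: the object that would be shipped to the executors.
class Service implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Mapper mapper;

    Service(Mapper mapper) {
        this.mapper = mapper;
    }
}

class TaskSerializationSketch {

    // Tries to serialize the task with plain Java serialization.
    // Returns null on success, or the name of the class that blocked it.
    static String firstNonSerializable(Object task) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(task);
            return null;
        } catch (NotSerializableException e) {
            // The exception message is the name of the offending class.
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Service broken = new Service(new Mapper(new NonSerializableInstantiator()));
        Service fixed = new Service(new Mapper(new SerializableInstantiator()));
        System.out.println("blocked by: " + firstNonSerializable(broken));
        System.out.println("blocked by: " + firstNonSerializable(fixed));
    }
}
```

The point of the sketch is that serialization walks every field reachable from the task, so a single non-serializable object deep in the graph fails the whole task, and replacing just that object is sufficient. A real MyHandlerInstantiator would additionally have to implement the abstract methods of Jackson's HandlerInstantiator, which this sketch leaves out.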
