简体   繁体   中英

Spring Boot ThreadPoolTaskExecutor memory leak

I have Spring Boot app running on Wildfly 18.0.1. The main purpose of the app is: every 5 minutes run some job. So I make:

TaskScheduler: initialize scheduler

@Autowired
ThreadPoolTaskScheduler taskScheduler;
taskScheduler.scheduleWithFixedDelay(new ScheduledVehicleDataUpdate(), 300000);

ScheduledVehicleDataUpdate: scheduler that runs updater

public class ScheduledVehicleDataUpdate implements Runnable {
    @Autowired
    TaskExecutor taskExecutor;

    @Override
    public void run() {
        try {
            CountDownLatch countDownLatch;
            List<VehicleEntity> vehicleList = VehicleService.getInstance().getList();
            if (vehicleList.size() > 0) {
                countDownLatch = new CountDownLatch(vehiclesList.size());
                vehicleList.forEach(vehicle -> taskExecutor.execute(new VehicleDataUpdater(vehicle, countDownLatch)));
                countDownLatch.await();
            }
        }
        catch (InterruptedException | RuntimeException e) {
            System.out.println(e.getMessage())
        }
    }
}

TaskExecutor:

@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(23);
    executor.setMaxPoolSize(23);
    executor.setQueueCapacity(5000);
    executor.setThreadNamePrefix("VehicleService_updater_thread");
    executor.initialize();
    return executor;
}

VehicleDataUpdater: main updater class

public class VehicleDataUpdater implements Runnable {
    private final VehicleEntity vehicle;
    private final CountDownLatch countDownLatch;

    public VehicleDataUpdater(VehicleEntity vehicle, CountDownLatch countDownLatch) {
        this.vehicle = vehicle;
        this.countDownLatch = countDownLatch;
    }

    @Override
    public void run() {    
        try {
            this.updateVehicleData();
        }
        catch (Exception e) {
            System.out.println(e.getMessage());
        }
        finally {
            countDownLatch.countDown();
        }
    }

    public void updateVehicleData() {
        // DO UPDATE ACTIONS;
    }
}

The issue is that after finish ScheduledVehicleDataUpdate the memory is NOT clearing. It looks like this: 在此处输入图像描述

Every step the memory is growing, growing, growing and at unpredictable moment all memory is freed up. And objects from first iteration, and object from last iteration. In the most bad case it takes all available memory (120Gb) and Wildfly crashes.

I have about 3200 VehicleEntity records (lets assume exactly 3200). So I've looked for the VehicleDataUpdater - how much objects there are in memory. After first iteration (when I only started app) it is less than 3200 but not zero - maybe about 3000-3100. And every step it grows but not exactly on 3200 records. That means that some of objects clears from memory but most of them remains there.

Next: normal duration of iteration is about 30sec - 1min. When memory is not clear up and continue growing then each iteration get more and more time: the longest that I saw was 30 minutes. And threads from pool are mostly in "monitor" state, ie there are some locks waiting to release. Possibly locks from previous iterations that was not released - and question again - why all memory was not freed up at previous step?

If I execute update in one thread (without taskExecutor, simply vehicleList.foreach(vehicle -> VehicleDataUpdater(vehicle)); ) than I didn't see any memory growing. After update each vehicle memory is cleared.

I didn't find any issues with memory leaks for ThreadPoolTaskExecutor or ThreadPoolTaskScheduler, so I have no idea how to fix it.

What possible ways of not clearing memory after finish scheduler task? How can I look who is locking object after finish? I'm using VisualVM 2.0.1 and didn't find there such possibilities.

EDIT 1:

VehicleService:

public class VehicleService {
    private static VehicleService instance = null;
    private VehicleDao dao;

    public static VehicleService getInstance(){
        if (instance == null) {
            instance = new VehicleService();
        }
        return instance;
    }

    private VehicleService(){}

    public void setDao(VehicleDao vehicleDao) { this.dao = vehicleDao; }

    public List<VehicleEntity> list() {
        return new ArrayList<>(this.dao.list(LocalDateTime.now()));
    }
}

VehicleDao:

@Repository
public class VehicleDao {
    @PersistenceContext(unitName = "entityManager")
    private EntityManager entityManager;

    @Transactional("transactionManager")
    public List<VehicleRegisterEntity> list(LocalDateTime dtPeriod) {
        return this.entityManager.createQuery("SOME_QUERY", VehicleEntity.class).getResultList();
    }
}

InitService:

@Service
public class InitHibernateService {
    private final VehicleDao vehicleDao;

    @Autowired
    public InitHibernateService(VehicleDao vehicleDao){
        this.vehicleDao = vehicleDao;
    }

    @PostConstruct
    private void setDao() {
        VehicleService.getInstance().setDao(this.vehicleDao);
    }
}

EntityManager:

@Bean(name = "entityManager")
@DependsOn("dataSource")
public LocalContainerEntityManagerFactoryBean entityManagerFactory() throws NamingException {
    LocalContainerEntityManagerFactoryBean em = new LocalContainerEntityManagerFactoryBean();
    em.setPersistenceProviderClass(HibernatePersistenceProvider.class);
    em.setDataSource(dataSource());
    em.setPackagesToScan("MY_PACKAGE");
    em.setJpaVendorAdapter(vendorAdapter());
    em.setJpaProperties(hibernateProperties());
    em.setPersistenceUnitName("customEntityManager");
    em.setJpaDialect(new CustomHibernateJpaDialect());
    return em;
}

Looking at what you are trying to achieve is basically optimal batch processing when using JPA. However you are trying to use a canon (multi-threading) instead of solving the actual issue. For a nice overview I strongly suggest a read of [this blog post][1].

  1. Use chunk processing and flush the entity manager after x records and then clear. This prevents you from doing a lot of dirty checks in the first level cache
  2. Enable Batch statements on hibernate as well as ordering inserts and updates

First of all start with the properties make sure that your hibernateProperties contains the following

hibernate.jdbc.batch_size=25
hibernate.order_inserts=true
hibernate.order_updates=true

Then rewrite your ScheduledVehicleDataUpdate to take advantage of this and periodically flush/clear the entitymanager.

@Component
public class ScheduledVehicleDataUpdate {
    @PersistenceContext
    private EntityManager em;

    @Scheduled(fixedDelayString="${your-delay-property-here}")
    @Transactional
    public void run() {
        try {
            List<VehicleEntity> vehicleList = getList();
            for (int i = 0 ; i < vehicleList.size() ; i++) {
              updateVehicle(vehicleList.get(i));
              if ( (i % 25) == 0) {
                em.flush();
                em.clear();
              }
            }
        }
    }

    private void updateVehicle(Vehicle vehicle) {
       // Your updates here
    }

    private List<VehicleEntity> getList() {
        return this.entityManager.createQuery("SOME_QUERY", VehicleEntity.class).getResultList();
    }
}

Now you could also decrease the memory consumption of the getList by making that a bit more lazy (ie only retrieve the data when you need it). You can do this by tapping into hibernate and use a stream method (as of Hibernate 5.2) or when using older versions do a bit more work and use a ScrollableResult (see Is there are way to scroll results with JPA/hibernate? ). If you are already on JPA 2.2 (ie Hibernate 5.3) you can use getResultStream directly.

private Stream<VehicleEntity> getList() {
  Query q = this.entityManager.createQuery("SOME_QUERY", VehicleEntity.class);
  org.hibernate.query.Query hq = q.unwrap(org.hibernate.query.Query.class);
  return hq.stream();
}

or with JPA 2.2

private Stream<VehicleEntity> getList() {
  Query q = this.entityManager.createQuery("SOME_QUERY", VehicleEntity.class);
  return q.getResultStream();
}

In your code you would need to change the for loop to work with a stream, and keep a counter yourself and still flush periodically. Using a stream will unlikely improve performance (might even degrade it) but will use less memory then when retrieving all elements at once. As you only have as many objects in-memory as you are using for the batch-size..

@Scheduled(fixedDelayString="${your-delay-property-here}")
    @Transactional
    public void run() {
        try {
            Stream<VehicleEntity> vehicles = getList();
            LongAdder counter = new LongAdder();
            vehicles.forEach(it -> {
              counter.increment();
              updateVehicle(it);
              if ( (counter.longValue() % 25) == 0) {
                em.flush();
                em.clear();
              }
            });
            }
        }
    }

Something like this should do the trick.

NOTE: I typed the code as I went along, this might not compile due to some missing brackets, imports etc.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM