
Proper way to clear a buffer in a different thread

I have run into the following problem.
I have a class Processor that receives data from a client app, performs a series of calculations, and inserts the results into a database.
I moved the database insert into a separate class, DataLayer.
To avoid blocking the calling thread, DataLayer does not insert into the database right away; instead it adds the results to a list, List<Result> buffer.
Every hour, DataLayer inserts the contents of the buffer into the database and then clears the buffer.
I am fairly sure I could run into a race condition when clearing the buffer.
What is the right way to do this without running into concurrency issues?

code snippet:

public class Processor {

  private DataLayer dataLayer = new DataLayer();

  void accept(List<Data> data) {
     // receive data from client app
     List<Result> results = calculateResults(data);
     saveResults(results);
  }

  List<Result> calculateResults(List<Data> data) {
     // ... (body omitted in the original post)
  }

  void saveResults(List<Result> results) {
     dataLayer.insertToDB(results);
  }

}


public class DataLayer {

   private ThreadPoolTaskScheduler taskScheduler;

   private List<Result> buffer = new ArrayList<>();

   public DataLayer() {
      scheduleHourlyCheck();
   }

   // despite the name, this only buffers the results; the scheduled task writes them
   void insertToDB(List<Result> results) {
      this.buffer.addAll(results);
   }

   void scheduleHourlyCheck() {
     taskScheduler.schedule(() -> {
       jdbcTemplate.update(...);
       buffer.clear();   // ------> I am sure this shouldn't be done
     }, everyHour);
   }
}

My concern is the call to buffer.clear().

I am sure this could cause concurrency issues.
What is the proper way to handle this?

The smallest change you need to make is to wrap the ArrayList with Collections.synchronizedList:

private List<Result> buffer = Collections.synchronizedList(new ArrayList<>());

This would be required even if you didn't periodically call clear .

This prevents two threads from accessing the list at the same time, whether both are adding items or one is adding items while the other clears the list.

If you iterate over the list inside jdbcTemplate.update(...), that iteration still needs to be synchronized externally:

synchronized (buffer) {
    jdbcTemplate.update(...);
}
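As a rough illustration of both points, here is a minimal sketch. The class and method names (SynchronizedListSketch, addConcurrently) are made up for this example, and Integer stands in for Result. Several threads add to a synchronized list without extra locking, while the iteration at the end holds the list's own lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class SynchronizedListSketch {

    // Starts `threads` writers that each add `perThread` items, then counts the items.
    static int addConcurrently(int threads, int perThread) {
        List<Integer> buffer = Collections.synchronizedList(new ArrayList<>());
        List<Thread> writers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread writer = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buffer.add(i); // each add() locks the list internally
                }
            });
            writers.add(writer);
            writer.start();
        }
        try {
            for (Thread writer : writers) {
                writer.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        int count = 0;
        // Iteration is a compound operation, so hold the list's own lock for its duration.
        synchronized (buffer) {
            for (Integer ignored : buffer) {
                count++;
            }
        }
        return count;
    }
}
```

With a plain ArrayList the same writers could lose elements or throw; with the wrapper, every element added should still be present when counted.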

The main problem is that some inserts could be lost. Anything added between writing the current list to the database ( jdbcTemplate.update(...) ) and clearing it right after ( buffer.clear() ) is never written, because clear() removes it immediately.
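To make that window concrete, the sketch below replays the problematic interleaving on a single thread. The class LostInsertSketch and the use of String instead of Result are assumptions for illustration only, and the snapshot copy merely stands in for whatever jdbcTemplate.update(...) would write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class LostInsertSketch {

    // Replays the bad interleaving and returns what reached the "database".
    static List<String> replay(List<String> buffer) {
        buffer.add("first");
        // Stand-in for jdbcTemplate.update(...): snapshot what gets written.
        List<String> written = new ArrayList<>(buffer);
        // Another thread's insertToDB call lands between the write and the clear.
        buffer.add("second");
        buffer.clear(); // "second" is discarded without ever being written
        return written;
    }
}
```

After replay, the returned list holds only "first" and the buffer is empty, so "second" is gone.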

One way to solve that is to guard both sections with synchronized blocks on the same lock:

void insertToDB(List<Result> results) {
    synchronized (buffer) {
        this.buffer.addAll(results);
    }
}

void scheduleHourlyCheck() {
    taskScheduler.schedule(() -> {
        // write and clear under one lock, so no add can slip in between
        synchronized (buffer) {
            jdbcTemplate.update(...);
            buffer.clear();
        }
    }, everyHour);
}

If you don't want the adding threads to wait for the whole database update to finish (and you almost certainly don't), copy the list inside the lock and run the update on the copy:

void scheduleHourlyCheck() {
    taskScheduler.schedule(() -> {
        List<Result> temp;
        synchronized (buffer) {
            // copy and clear atomically, so nothing added in between is lost
            temp = new ArrayList<>(buffer);
            buffer.clear();
        }
        jdbcTemplate.update(...); // but use temp here, outside the lock
    }, everyHour);
}

Now you have exactly what you described in the question: a synchronized mechanism that periodically writes the buffer to your database.
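The same idea can be packaged into a small helper so the locking lives in one place. This is only a sketch: DrainingBuffer, addAll and drain are hypothetical names, not anything from Spring or the original code. It also swaps in a fresh list instead of copying and clearing, a minor variation that holds the lock for slightly less time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original post.
class DrainingBuffer<T> {

    private final Object lock = new Object();
    private List<T> items = new ArrayList<>();

    // Called by producer threads (the role of insertToDB in the question).
    void addAll(List<T> results) {
        synchronized (lock) {
            items.addAll(results);
        }
    }

    // Called by the scheduled task: hands over everything buffered so far
    // and leaves an empty list behind, all under the same lock.
    List<T> drain() {
        synchronized (lock) {
            List<T> drained = items;
            items = new ArrayList<>();
            return drained;
        }
    }
}
```

The hourly task would then call drain() and pass the returned list to jdbcTemplate.update(...) outside the lock, while producers continue to call addAll without waiting for the database.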
