
Spring Jpa Bulk Insert whole data or Update some fields of entity if already available

Problem: I have a process that periodically fetches 500 records from an external source and inserts them into the DB. How can I handle the following cases efficiently with Spring JPA?

  1. How do I insert a record if none exists yet (matched on a non-primary-key column that is also unique)?
  2. How do I update only some fields if the record already exists?

or

How do I do a saveOrUpdate on bulk records, on either all columns or only selected columns, using Spring JPA?

If you look at the source code of Spring Data JPA (SimpleJpaRepository), you'll find:

@Transactional
public <S extends T> S save(S entity) {

  if (entityInformation.isNew(entity)) {
    em.persist(entity);
    return entity;
  } else {
    return em.merge(entity);
  }
}

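So the save operation is a mix of insert and update: it persists a new entity and merges an existing one. By calling save() or saveAll(), you already get saveOrUpdate behavior. Note that, by default, isNew() is decided by the entity's identifier (or its version attribute, if it has one), not by any other unique column.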

To get more control over this upsert behavior, you can put Hibernate's @DynamicUpdate annotation on the entity class, so that the generated UPDATE statement includes only the columns that actually changed.

That alone is not enough to use JPA correctly in your situation. If you choose JPA, you have to access the database through objects. To process the 500 records, you must first decide what identifies a record as unique, which is one of the following:

  • the primary key
  • the unique constraint

I suggest you use the second one, since the primary key only identifies the row at the database level. Then, with JPA, first load the rows that already exist in the database for the 500 incoming records, apply the modified columns to them, and call saveAll() to update them. After that, build entities for the remaining records and call saveAll() to insert them.
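The split described above can be sketched in plain Java. This is only an illustration under assumptions: the Employee shape, the UpsertPlanner class, and the idea that existingByCode was loaded with a single query (for example a hypothetical repository.findByCodeIn(codes)) are not from the question, and the sketch assumes codes are unique within one incoming batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: decides which rows to update and which to insert.
final class UpsertPlanner {

    /** Assumed row shape: 'code' is the unique, non-primary-key column. */
    static final class Employee {
        final Long id;      // null until the row has been inserted
        final String code;
        String name;

        Employee(Long id, String code, String name) {
            this.id = id;
            this.code = code;
            this.name = name;
        }
    }

    /** Result of the split. */
    static final class Plan {
        final List<Employee> toUpdate = new ArrayList<>();
        final List<Employee> toInsert = new ArrayList<>();
    }

    /**
     * existingByCode holds the rows already in the database, keyed by code.
     * Existing rows are modified in place so they keep their primary key.
     */
    static Plan plan(Map<String, Employee> existingByCode, List<Employee> incoming) {
        Plan plan = new Plan();
        for (Employee in : incoming) {
            Employee existing = existingByCode.get(in.code);
            if (existing == null) {
                plan.toInsert.add(in);                 // no row with this code yet
            } else if (!Objects.equals(existing.name, in.name)) {
                existing.name = in.name;               // copy only the fields that may change
                plan.toUpdate.add(existing);
            }
            // rows whose fields are unchanged are skipped entirely
        }
        return plan;
    }
}
```

In a Spring Data JPA application, plan.toUpdate and plan.toInsert would each be handed to saveAll(), as described above.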

You may have noticed that the in-memory compare, update, then save approach I described above is not atomic, and it may cause a ConstraintViolationException when duplicate data is inserted. There are two ways to handle this:

  • If the data is not critical, you can just run the operations above and leave room for the exception to happen; you can either log the details for a manual fix or do nothing. But remember to catch the exception, so that the update operation is not rolled back.
  • If the data is critical, you can serialize the whole operation, either with the synchronized keyword or with a distributed lock.
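For the second option, a minimal single-JVM sketch might look like the following. Everything in it is hypothetical: SerializedUpsert is an illustrative name, and an in-memory map stands in for the database table. It only shows the compare-then-write step running as one critical section; with several application instances a distributed lock would be needed instead, which this does not cover.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: an in-memory map stands in for the database table.
final class SerializedUpsert {
    private final Map<String, String> nameByCode = new HashMap<>(); // unique code -> name
    private final ReentrantLock lock = new ReentrantLock();
    private int inserts;
    private int updates;

    /** Runs the whole check-then-write batch as one critical section. */
    void upsertBatch(List<String[]> batch) {
        lock.lock();
        try {
            for (String[] row : batch) {           // row = {code, name}
                if (nameByCode.containsKey(row[0])) {
                    updates++;                     // would be an UPDATE
                } else {
                    inserts++;                     // would be an INSERT
                }
                nameByCode.put(row[0], row[1]);
            }
        } finally {
            lock.unlock();                         // always release, even on failure
        }
    }

    int inserts() {
        lock.lock();
        try { return inserts; } finally { lock.unlock(); }
    }

    int updates() {
        lock.lock();
        try { return updates; } finally { lock.unlock(); }
    }
}
```

Because the existence check and the write happen under the same lock, two concurrent batches cannot both decide that the same code is missing and both try to insert it.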

To be honest, I'm not satisfied with JPA for batch upserts: it's not atomic, not safe, and not fast. I think that's because there is no upsert pattern commonly implemented across all DBMSs, so ORMs partly give up on the feature. I sincerely recommend using the native upsert of the database your system uses, written in raw SQL, such as MySQL's INSERT ... ON DUPLICATE KEY UPDATE statement. Sample code below:

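  // Needs: java.util.List, java.util.Map, java.util.stream.Collectors.
  // jdbcTemplate is assumed to be a NamedParameterJdbcTemplate, whose
  // update(String, Map) overload accepts an empty parameter map.
  public int upsert(List<Employee> employees) {

      // Rows that hit an existing primary or unique key only get `name` updated.
      String sqlPattern = "INSERT INTO employee (id, code, name)\n"
              + "VALUES %s\n"
              + "ON DUPLICATE KEY UPDATE name = VALUES(name)";

      String values = employees.stream()
              // buildRecord (not shown) renders one row, e.g. (1, '10001', 'Foo Bar'),
              // and must escape its values because they are inlined into the SQL
              .map(employee -> buildRecord(employee))
              .collect(Collectors.joining(", "));

      String sql = String.format(sqlPattern, values);

      // Returns the affected-row count reported by the database
      return jdbcTemplate.update(sql, Map.of());
  }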
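Since the sample above inlines the values into the SQL text, one possible variant is to generate placeholders and pass the values as parameters instead. The following is a sketch under assumptions, not part of the original answer: UpsertSqlBuilder and the Employee shape are hypothetical, and the table and column names are taken from the sample.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that builds a multi-row MySQL upsert with '?' placeholders.
final class UpsertSqlBuilder {

    /** Assumed row shape matching the employee (id, code, name) table. */
    static final class Employee {
        final long id;
        final String code;
        final String name;

        Employee(long id, String code, String name) {
            this.id = id;
            this.code = code;
            this.name = name;
        }
    }

    /** Builds INSERT ... VALUES (?, ?, ?), (?, ?, ?) ... for rowCount rows. */
    static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(rowCount, "(?, ?, ?)"));
        return "INSERT INTO employee (id, code, name) VALUES " + placeholders
                + " ON DUPLICATE KEY UPDATE name = VALUES(name)";
    }

    /** Flattens the rows into positional arguments, in placeholder order. */
    static Object[] buildArgs(List<Employee> employees) {
        List<Object> args = new ArrayList<>();
        for (Employee e : employees) {
            args.add(e.id);
            args.add(e.code);
            args.add(e.name);
        }
        return args.toArray();
    }
}
```

With a plain JdbcTemplate, the two results could be passed to update(String sql, Object... args), so that quoting and escaping are left to the JDBC driver rather than to a hand-written buildRecord.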
