Spring Boot JPA/JDBC batching: findById works but findOneByX does not

I am using Spring Boot JPA, and I have enabled batching by making sure the following lines are in my application.properties:

spring.jpa.properties.hibernate.jdbc.batch_size=1000
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true

I now have a loop in which I do a findById on an entity and then save that entity, like so:

var entity = dao.findById(id)
// Do some stuff
dao.save(entity) //This line is not really required but I am being explicit here

Putting the above in a loop, I see that the save (update) statements are batched to the DB. My issue is that if I do a findOneByX, where X is a property on the entity, the batching does not work (batch size of 1) and requests are sent one at a time, i.e.:

var entity = dao.findOneByX(x)
// Do some stuff
dao.save(entity)

Why is this happening? Is JPA/JDBC only equipped to batch when we use findById?

Solution

Refer to "How to implement batch update using Spring Data Jpa?"

  1. Fetch the entities you want to update into a list
  2. Update as desired
  3. Call saveAll

PS: beware of the memory usage of this solution when the list is large.
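
The three steps above might look roughly like the sketch below. To keep it compilable on its own, Student, StudentDao, InMemoryStudentDao, BulkUpdater and incrementAges are hypothetical stand-ins rather than anything from the question. In a real project StudentDao would presumably be a Spring Data repository, with findAllByNameIn declared as a derived query and saveAll inherited from the repository base interface (where it accepts an Iterable and returns the saved entities; the void signature here is a simplification).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the JPA entity.
class Student {
    final long id;
    final String name;
    int age;
    Student(long id, String name, int age) { this.id = id; this.name = name; this.age = age; }
}

// Hypothetical stand-in that mirrors the shape of a Spring Data repository.
interface StudentDao {
    List<Student> findAllByNameIn(Collection<String> names);
    void saveAll(List<Student> students);
}

// In-memory implementation, only here so the sketch can run without a database.
class InMemoryStudentDao implements StudentDao {
    final List<Student> rows = new ArrayList<>();
    int saveAllCalls = 0;

    public List<Student> findAllByNameIn(Collection<String> names) {
        List<Student> found = new ArrayList<>();
        for (Student s : rows) {
            if (names.contains(s.name)) found.add(s);
        }
        return found;
    }

    public void saveAll(List<Student> students) { saveAllCalls++; }
}

class BulkUpdater {
    private final StudentDao dao;
    BulkUpdater(StudentDao dao) { this.dao = dao; }

    // With Spring, this method would run inside a transaction (@Transactional).
    int incrementAges(Collection<String> names) {
        List<Student> students = dao.findAllByNameIn(names); // 1. fetch everything before changing anything
        for (Student s : students) {
            s.age += 1;                                      // 2. update in memory, no queries in the loop
        }
        dao.saveAll(students);                               // 3. hand all changes over in one call
        return students.size();
    }
}
```

Because every query runs before the first modification, nothing is queued when the query executes, so there is nothing to flush early and the updates can go out together.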


Why do findById and findOneByX behave differently?

As suggested by M. Deinum, Hibernate will auto-flush your changes

prior to executing a JPQL/HQL query that overlaps with the queued entity actions

Since both findById and findOneByX execute a query, what is the difference between them?

First, the reason to flush is to make sure the session and the database are in the same state, so that you get consistent results from the session cache (if available) and from the database.

When you call findById, Hibernate first tries to get the entity from the session cache and only fetches it from the database if it is not there. For findOneByX, we always need to fetch from the database, because the session cache is keyed by id and cannot look up an entity by X.

Then consider the example below:

@Entity
@Getter
@Setter
@NoArgsConstructor
@AllArgsConstructor
public class Student {
    @Id
    private Long id;
    private String name;
    private int age;
}

Suppose the table contains:

id  name  age
1   Amy   10

@Transactional
public void findByIdAndUpdate() {
    dao.save(new Student(2L, "Dennis", 14));
    // no need to flush as we can get from session
    for (int i = 0; i < 100; i++) {
        Student dennis = dao.findById(2L).orElseThrow();
        dennis.setAge(i);
        dao.save(dennis);
    }
}

This results in:

412041 nanoseconds spent executing 2 JDBC batches;

One batch for the insert and one for the update.

  • Hibernate: I'm sure the result can be fetched from the session (without a flush), or from the database if the record is not in the session, so let's skip flushing as it is slow!

@Transactional
public void findOneByNameAndUpdate() {
    Student amy = dao.findOneByName("Amy");
    // this change affects the later query
    amy.setName("Tammy");
    dao.save(amy);
    for (int i = 0; i < 100; i++) {
        // would you expect to get a result here?
        Student tammy = dao.findOneByName("Tammy");
        // Hibernate is not smart enough to notice that this change will not affect later results.
        tammy.setAge(i);
        dao.save(tammy);
    }
}

This results in:

13964088 nanoseconds spent executing 101 JDBC batches;

One batch for the first update and 100 for the updates in the loop.

  • Hibernate: Hmm, I'm not sure whether the queued update will affect the result; better flush it, or the developer will blame me.
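
The two batch counts can be mimicked with a small plain-Java model. This is only a sketch under loud assumptions: ToySession, seed and FlushDemo are made-up names, the model is not how Hibernate is implemented, and it flushes before every lookup by name, whereas Hibernate flushes only when the query overlaps with queued actions. It is just enough to show why a cache lookup by id leaves the update queue alone while a query drains it each time.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Toy model only: NOT Hibernate's implementation, just enough to count flushes.
class ToySession {
    static final class Student {
        final long id;
        String name;
        int age;
        Student(long id, String name, int age) { this.id = id; this.name = name; this.age = age; }
        Student copy() { return new Student(id, name, age); }
    }

    private final Map<Long, Student> database = new HashMap<>(); // rows already written
    private final Map<Long, Student> cache = new HashMap<>();    // session cache, keyed by id
    private final Set<Long> pendingInserts = new LinkedHashSet<>();
    private final Set<Long> pendingUpdates = new LinkedHashSet<>();
    int batchesSent = 0;

    // Pre-existing row, written outside the session.
    void seed(Student s) { database.put(s.id, s.copy()); }

    // Queue an insert for an unknown entity, otherwise queue an update.
    void save(Student s) {
        if (!cache.containsKey(s.id) && !database.containsKey(s.id)) {
            cache.put(s.id, s);
            pendingInserts.add(s.id);
        } else {
            cache.putIfAbsent(s.id, s);
            pendingUpdates.add(s.id);
        }
    }

    // Lookup by id: answered from the cache when possible, never flushes.
    Student findById(long id) {
        Student managed = cache.get(id);
        if (managed != null) return managed;
        if (!database.containsKey(id)) throw new NoSuchElementException("id " + id);
        return manage(id);
    }

    // Lookup by another column: must read the rows, so queued changes are flushed first.
    Student findOneByName(String name) {
        flush();
        for (Student row : database.values()) {
            if (row.name.equals(name)) return manage(row.id);
        }
        throw new NoSuchElementException(name);
    }

    void commit() { flush(); }

    private Student manage(long id) {
        return cache.computeIfAbsent(id, key -> database.get(key).copy());
    }

    // One batch per non-empty queue: inserts first, then updates.
    private void flush() {
        if (!pendingInserts.isEmpty()) write(pendingInserts);
        if (!pendingUpdates.isEmpty()) write(pendingUpdates);
    }

    private void write(Set<Long> ids) {
        for (long id : ids) database.put(id, cache.get(id).copy());
        ids.clear();
        batchesSent++;
    }
}

class FlushDemo {
    static int findByIdLoop() {
        ToySession session = new ToySession();
        session.save(new ToySession.Student(2L, "Dennis", 14));
        for (int i = 0; i < 100; i++) {
            ToySession.Student dennis = session.findById(2L); // cache hit, queue untouched
            dennis.age = i;
            session.save(dennis);
        }
        session.commit();
        return session.batchesSent;
    }

    static int findOneByNameLoop() {
        ToySession session = new ToySession();
        session.seed(new ToySession.Student(1L, "Amy", 10));
        ToySession.Student amy = session.findOneByName("Amy");
        amy.name = "Tammy";
        session.save(amy);
        for (int i = 0; i < 100; i++) {
            ToySession.Student tammy = session.findOneByName("Tammy"); // flushes the queued update
            tammy.age = i;
            session.save(tammy);
        }
        session.commit();
        return session.batchesSent;
    }

    public static void main(String[] args) {
        System.out.println(findByIdLoop());      // prints 2
        System.out.println(findOneByNameLoop()); // prints 101
    }
}
```

In the model, the first loop sends one insert batch and one update batch at commit, while the second sends one batch per lookup plus a final one at commit, which matches the 2 and 101 batches reported in the logs.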
