
Spring Data JPA Is Too Slow

I recently switched my app to Spring Boot 2. I rely on Spring Data JPA to handle all transactions, and I noticed a huge speed difference between this and my old configuration. Storing around 1000 elements used to take around 6 seconds and now takes over 25 seconds. I have seen SO posts about batching with Data JPA, but none of them worked.

Let me show you the two configurations:

The entity (common to both):

    @Entity
    @Table(name = "category")
    public class CategoryDB implements Serializable
    {
        private static final long serialVersionUID = -7047292240228252349L;

        @Id
        @Column(name = "category_id", length = 24)
        private String category_id;

        @Column(name = "category_name", length = 50)
        private String name;

        @Column(name = "category_plural_name", length = 50)
        private String pluralName;

        @Column(name = "url_icon", length = 200)
        private String url;

        @Column(name = "parent_category", length = 24)
        @JoinColumn(name = "parent_category", referencedColumnName = "category_id")
        private String parentID;

        //Getters & Setters

     }

Old repository (showing an insert only):

    @Override
    public Set<String> insert(Set<CategoryDB> element)
    {
        Set<String> ids = new HashSet<>();
        Transaction tx = session.beginTransaction();
        for (CategoryDB category : element)
        {
            String id = (String) session.save(category);
            ids.add(id);
        }
        tx.commit();
        return ids;
    }

Old Hibernate XML config file:

    <property name="show_sql">true</property>
    <property name="format_sql">true</property>

    <!-- connection information -->
    <property name="hibernate.connection.driver_class">com.mysql.cj.jdbc.Driver</property>
    <property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>

    <!-- database pooling information -->
    <property name="connection_provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
    <property name="hibernate.c3p0.min_size">5</property>
    <property name="hibernate.c3p0.max_size">100</property>
    <property name="hibernate.c3p0.timeout">300</property>
    <property name="hibernate.c3p0.max_statements">50</property>
    <property name="hibernate.c3p0.idle_test_period">3000</property>

Old statistics:

18949156 nanoseconds spent acquiring 2 JDBC connections;
5025322 nanoseconds spent releasing 2 JDBC connections;
33116643 nanoseconds spent preparing 942 JDBC statements;
3185229893 nanoseconds spent executing 942 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
3374152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)

New repository:

@Repository
public interface CategoryRepository extends JpaRepository<CategoryDB,String>
{
    @Query("SELECT cat.parentID FROM CategoryDB cat WHERE cat.category_id = :#{#category.category_id}")
    String getParentID(@Param("category") CategoryDB category);
}

And I'm using saveAll() in my service.

New application.properties:

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver

spring.datasource.hikari.connection-timeout=6000
spring.datasource.hikari.maximum-pool-size=10

spring.jpa.properties.hibernate.show_sql=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.generate_statistics = true
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQLDialect
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true

New statistics:

24543605 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
136919170 nanoseconds spent preparing 942 JDBC statements;
5457451561 nanoseconds spent executing 941 JDBC statements;
19985781508 nanoseconds spent executing 19 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
20256178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

I'm probably misconfiguring something on the Spring side. This is a huge performance difference and I've hit a dead end. Any hints on what is going wrong here would be much appreciated.

Let's merge the statistics so they can be easily compared. Old rows are prefixed with o, new ones with n. Rows with a count of 0 in both versions are ignored. Nanosecond measurements are formatted with a gap so that the digits before the gap are whole milliseconds.

o:    18 949156 nanoseconds spent acquiring 2 JDBC connections;
n:    24 543605 nanoseconds spent acquiring 1 JDBC connections;

o:    33 116643 nanoseconds spent preparing 942 JDBC statements;
n:   136 919170 nanoseconds spent preparing 942 JDBC statements;

o:  3185 229893 nanoseconds spent executing 942 JDBC statements;
n:  5457 451561 nanoseconds spent executing 941 JDBC statements; // losing ~2sec

o:            0 nanoseconds spent executing 0 JDBC batches;
n: 19985 781508 nanoseconds spent executing 19 JDBC batches; // losing ~20sec

o:  3374 152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
n: 20256 178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections); // losing ~17sec, processing 3 times the entities

o:         6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
n:            0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
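
A side-by-side listing like the one above can also be produced mechanically. The following is a minimal sketch using only the JDK; the class name StatsCompare and its helper methods are hypothetical, and the regular expression is an assumption based on the line format shown in the question rather than on any documented Hibernate output format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatsCompare {

    // Matches lines such as "3185229893 nanoseconds spent executing 942 JDBC statements;"
    private static final Pattern LINE =
            Pattern.compile("(\\d+) nanoseconds spent (\\w+) (\\d+) ([^;(]+)");

    /** Parses a statistics dump into "verb subject" -> elapsed nanoseconds. */
    static Map<String, Long> parse(String stats) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : stats.split("\\R")) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                String key = m.group(2) + " " + m.group(4).trim();
                result.put(key, Long.parseLong(m.group(1)));
            }
        }
        return result;
    }

    /** Whole milliseconds, dropping the sub-millisecond remainder. */
    static long toMillis(long nanos) {
        return nanos / 1_000_000L;
    }

    public static void main(String[] args) {
        // A subset of the statistics posted in the question.
        String oldStats =
                "33116643 nanoseconds spent preparing 942 JDBC statements;\n"
              + "3185229893 nanoseconds spent executing 942 JDBC statements;\n"
              + "0 nanoseconds spent executing 0 JDBC batches;\n"
              + "3374152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);\n";
        String newStats =
                "136919170 nanoseconds spent preparing 942 JDBC statements;\n"
              + "5457451561 nanoseconds spent executing 941 JDBC statements;\n"
              + "19985781508 nanoseconds spent executing 19 JDBC batches;\n"
              + "20256178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections);\n";

        Map<String, Long> o = parse(oldStats);
        Map<String, Long> n = parse(newStats);

        // One row per metric: old time, new time, and the difference in milliseconds.
        for (Map.Entry<String, Long> e : n.entrySet()) {
            long oldMs = toMillis(o.getOrDefault(e.getKey(), 0L));
            long newMs = toMillis(e.getValue());
            System.out.printf("%-28s old %6d ms, new %6d ms, diff %+7d ms%n",
                    e.getKey(), oldMs, newMs, newMs - oldMs);
        }
    }
}
```

Keying each row by verb and subject (for example "executing JDBC batches") keeps the statement and batch timings apart even though both lines start with "executing".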

The following seem to be the relevant points:

  • The new version has 19 batches, which take 20 seconds and which don't exist in the old version at all.

  • The new version has 3 flushes instead of 1, which together take about 17 seconds more, or about 6 times as long. This is probably more or less the same extra time as the batches, since the batches are almost certainly part of these flushes.

Although batches are supposed to make things faster, there are reports of them making things slower, especially with MySQL: Why Spring's jdbcTemplate.batchUpdate() so slow?

This brings us to a couple of things you can try or investigate:

  • Disable batching, in order to test whether you are actually suffering from some kind of slow batch problem.
  • Use the linked SO post in order to speed up batching.
  • Log the SQL statements that actually get executed in order to find the difference. Since this will result in rather lengthy logs, try extracting only the SQL statements into two files and comparing them with a diff tool.
  • Log flushes in order to get an idea of why extra flushes are triggered.
  • Use breakpoints and a debugger, or extra logging, to find out which entities are getting flushed and why you have far more entities in the second variant.
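
As a rough sketch, the configuration-related suggestions above could look like this in application.properties. The JDBC URL (host, port and database name) is hypothetical, since the question does not show it. rewriteBatchedStatements is a MySQL Connector/J connection property that is commonly suggested for slow MySQL batches; whether it helps in this case is an assumption to verify by measuring. The flush logging category is likewise an assumption about where Hibernate 5 reports flush activity.

```properties
# 1) Disable batching for a test run by commenting out the batch settings
# spring.jpa.properties.hibernate.jdbc.batch_size=50
# spring.jpa.properties.hibernate.order_inserts=true

# 2) Alternatively keep batching and let Connector/J rewrite batches
#    (hypothetical URL; adjust host, port and schema to your setup)
spring.datasource.url=jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true

# 3) Log the executed SQL through the logging framework so it can be diffed
logging.level.org.hibernate.SQL=DEBUG

# 4) Log flush activity (assumed category: Hibernate's event listener package)
logging.level.org.hibernate.event.internal=DEBUG
```

Options 1 and 2 are alternatives; trying them one at a time and comparing the resulting statistics shows which of them actually changes the timings.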

All the proposals above operate on JPA. But your statistics and the content of your question suggest that you are doing simple inserts into a single table or a few tables. Doing this with JDBC, e.g. with a JdbcTemplate, might be more efficient and at least easier to understand.
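
To illustrate the JDBC route, here is a minimal sketch of a batched insert. It deliberately uses plain java.sql instead of JdbcTemplate so that it compiles without any Spring dependency; JdbcTemplate's batchUpdate builds on the same PreparedStatement batching mechanism. The class CategoryJdbcInsert, the Category value holder and the helper batchCount are hypothetical names; the table and column names are taken from the annotations on CategoryDB in the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class CategoryJdbcInsert {

    // Column names come from the @Column annotations of CategoryDB.
    static final String INSERT_SQL =
            "INSERT INTO category (category_id, category_name, category_plural_name, "
          + "url_icon, parent_category) VALUES (?, ?, ?, ?, ?)";

    /** Plain value holder standing in for CategoryDB (hypothetical). */
    static final class Category {
        final String id, name, pluralName, url, parentId;

        Category(String id, String name, String pluralName, String url, String parentId) {
            this.id = id;
            this.name = name;
            this.pluralName = pluralName;
            this.url = url;
            this.parentId = parentId;
        }
    }

    /** Number of JDBC batches needed to send the given number of rows. */
    static int batchCount(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    /** Inserts all categories in one transaction and returns the number of batches sent. */
    static int insertAll(Connection conn, List<Category> categories, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        int sent = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Category c : categories) {
                ps.setString(1, c.id);
                ps.setString(2, c.name);
                ps.setString(3, c.pluralName);
                ps.setString(4, c.url);
                ps.setString(5, c.parentId);
                ps.addBatch();
                // Send a full batch to the database as soon as it is complete.
                if (++pending == batchSize) {
                    ps.executeBatch();
                    sent++;
                    pending = 0;
                }
            }
            // Send the remaining rows that did not fill a whole batch.
            if (pending > 0) {
                ps.executeBatch();
                sent++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
        return sent;
    }
}
```

With the 941 entities from the statistics and a batch size of 50, batchCount(941, 50) gives 19, which matches the 19 JDBC batches reported for the new configuration.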

You can use JdbcTemplate directly; it is much faster than Spring Data JPA.
