Hibernate N+1 from select across multiple tables

Given the following hibernate query:

String sql = "select distinct changeset " +
    "from Changeset changeset " +
    "join fetch changeset.changeEntries as changeEntry " +
    "join fetch changeEntry.repositoryEntity as repositoryEntity " +
    "join fetch repositoryEntity.project as project " +
    "join fetch changeset.author as changesetAuthor " +
    "where project.id = :projectID ";

Why is this resulting in an N+1 problem?

I expected this to generate a single SQL statement like the following (or something similar):

select *
  from Changeset 
  inner join changeEntry on changeset.id = changeEntry.changeset_id
  inner join repositoryEntity on changeEntry.repositoryentity_id = repositoryentity.id
  inner join project on repositoryentity.project_id = project.id
where project.id = ?

Instead, I see a large number of select statements being executed.

The data model here looks like this:

[Image: UML diagram of the data model, http://img29.imageshack.us/img29/4123/uml.png]

I would like the full object graph returned from the Select statement in a single trip to the database, which is why I'm explicitly using "fetch" in the hibernate query.

The Hibernate log statements are as follows:

Hibernate: select distinct changeset0_.id as id2_0_, changeentr1_.id as id1_1_, repository2_.id as id9_2_, project3_.id as id6_3_, user4_.id as id7_4_, changeset0_.author_id as author5_2_0_, changeset0_.createDate as createDate2_0_, changeset0_.message as message2_0_, changeset0_.revision as revision2_0_, changeentr1_.changeType as changeType1_1_, changeentr1_.changeset_id as changeset4_1_1_, changeentr1_.diff as diff1_1_, changeentr1_.repositoryEntity_id as reposito5_1_1_, changeentr1_.repositoryEntityVersion_id as reposito6_1_1_, changeentr1_.sourceChangeEntry_id as sourceCh7_1_1_, changeentr1_.changeset_id as changeset4_0__, changeentr1_.id as id0__, repository2_.project_id as connecti6_9_2_, repository2_.name as name9_2_, repository2_.parent_id as parent7_9_2_, repository2_.path as path9_2_, repository2_.state as state9_2_, repository2_.type as type9_2_, project3_.projectName as connecti2_6_3_, project3_.driverName as driverName6_3_, project3_.isAnonymous as isAnonym4_6_3_, project3_.lastUpdatedRevision as lastUpda5_6_3_, project3_.password as password6_3_, project3_.url as url6_3_, project3_.username as username6_3_, user4_.username as username7_4_, user4_.email as email7_4_, user4_.name as name7_4_, user4_.password as password7_4_, user4_.principles as principles7_4_, user4_.userType as userType7_4_ from Changeset changeset0_ inner join ChangeEntry changeentr1_ on changeset0_.id=changeentr1_.changeset_id inner join RepositoryEntity repository2_ on changeentr1_.repositoryEntity_id=repository2_.id inner join project project3_ on repository2_.project_id=project3_.id inner join users user4_ on changeset0_.author_id=user4_.id where project3_.id=? order by changeset0_.revision desc
Hibernate: select repository0_.id as id10_9_, repository0_.changeEntry_id as changeEn2_10_9_, repository0_.repositoryEntity_id as reposito3_10_9_, changeentr1_.id as id1_0_, changeentr1_.changeType as changeType1_0_, changeentr1_.changeset_id as changeset4_1_0_, changeentr1_.diff as diff1_0_, changeentr1_.repositoryEntity_id as reposito5_1_0_, changeentr1_.repositoryEntityVersion_id as reposito6_1_0_, changeentr1_.sourceChangeEntry_id as sourceCh7_1_0_, changeset2_.id as id2_1_, changeset2_.author_id as author5_2_1_, changeset2_.createDate as createDate2_1_, changeset2_.message as message2_1_, changeset2_.revision as revision2_1_, user3_.id as id7_2_, user3_.username as username7_2_, user3_.email as email7_2_, user3_.name as name7_2_, user3_.password as password7_2_, user3_.principles as principles7_2_, user3_.userType as userType7_2_, repository4_.id as id9_3_, repository4_.project_id as connecti6_9_3_, repository4_.name as name9_3_, repository4_.parent_id as parent7_9_3_, repository4_.path as path9_3_, repository4_.state as state9_3_, repository4_.type as type9_3_, project5_.id as id6_4_, project5_.projectName as connecti2_6_4_, project5_.driverName as driverName6_4_, project5_.isAnonymous as isAnonym4_6_4_, project5_.lastUpdatedRevision as lastUpda5_6_4_, project5_.password as password6_4_, project5_.url as url6_4_, project5_.username as username6_4_, repository6_.id as id9_5_, repository6_.project_id as connecti6_9_5_, repository6_.name as name9_5_, repository6_.parent_id as parent7_9_5_, repository6_.path as path9_5_, repository6_.state as state9_5_, repository6_.type as type9_5_, repository7_.id as id10_6_, repository7_.changeEntry_id as changeEn2_10_6_, repository7_.repositoryEntity_id as reposito3_10_6_, repository8_.id as id9_7_, repository8_.project_id as connecti6_9_7_, repository8_.name as name9_7_, repository8_.parent_id as parent7_9_7_, repository8_.path as path9_7_, repository8_.state as state9_7_, repository8_.type as type9_7_, changeentr9_.id as 
id1_8_, changeentr9_.changeType as changeType1_8_, changeentr9_.changeset_id as changeset4_1_8_, changeentr9_.diff as diff1_8_, changeentr9_.repositoryEntity_id as reposito5_1_8_, changeentr9_.repositoryEntityVersion_id as reposito6_1_8_, changeentr9_.sourceChangeEntry_id as sourceCh7_1_8_ from RepositoryEntityVersion repository0_ left outer join ChangeEntry changeentr1_ on repository0_.changeEntry_id=changeentr1_.id left outer join Changeset changeset2_ on changeentr1_.changeset_id=changeset2_.id left outer join users user3_ on changeset2_.author_id=user3_.id left outer join RepositoryEntity repository4_ on changeentr1_.repositoryEntity_id=repository4_.id left outer join project project5_ on repository4_.project_id=project5_.id left outer join RepositoryEntity repository6_ on repository4_.parent_id=repository6_.id left outer join RepositoryEntityVersion repository7_ on changeentr1_.repositoryEntityVersion_id=repository7_.id left outer join RepositoryEntity repository8_ on repository7_.repositoryEntity_id=repository8_.id left outer join ChangeEntry changeentr9_ on changeentr1_.sourceChangeEntry_id=changeentr9_.id where repository0_.id=?

The second statement is repeated many times: for a result set of 17 objects, it executed 521 times.

I suspect this is a result of the parent/child relationship in the RepositoryEntity object. For the purposes of this select, I only need the parent object fetched.

Any suggestions?

Unless you've mapped the collections as lazily loaded, loading the object is going to generate multiple selects, regardless of the additional HQL. Change your connection mappings to be lazily loaded. Additionally, unless connectionDetails can never be empty, I suggest you change the last join to a left join.

The first SQL statement you posted is the one you expected (apart from the inner join on users, which is missing from your "expected" SQL; it's present in your HQL, so the generated SQL is correct).

The second SQL is (simplified for clarity):

select *
  from RepositoryEntityVersion repository0_
  left outer join ChangeEntry changeentr1_ on repository0_.changeEntry_id=changeentr1_.id
  left outer join Changeset changeset2_ on changeentr1_.changeset_id=changeset2_.id
  left outer join users user3_ on changeset2_.author_id=user3_.id
  left outer join RepositoryEntity repository4_ on changeentr1_.repositoryEntity_id=repository4_.id
  left outer join project project5_ on repository4_.project_id=project5_.id
  left outer join RepositoryEntity repository6_ on repository4_.parent_id=repository6_.id
  left outer join RepositoryEntityVersion repository7_ on changeentr1_.repositoryEntityVersion_id=repository7_.id
  left outer join RepositoryEntity repository8_ on repository7_.repositoryEntity_id=repository8_.id
  left outer join ChangeEntry changeentr9_ on changeentr1_.sourceChangeEntry_id=changeentr9_.id
 where repository0_.id=?

The base table here is RepositoryEntityVersion, which is not on your diagram; I'm guessing it's mapped as one-to-many on RepositoryEntity? I'm further guessing that it's mapped for eager fetching, which is where your problem lies.
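To make the effect of an eager association concrete, here is a toy model in plain Java. It does not use Hibernate at all; the class and method names (EagerFetchModel, loadParents, loadAssociated) are made up for illustration, and the row count of 521 is borrowed from the question only as an example number. It just counts simulated selects: one for the main query, plus one per row when an eagerly mapped association is not covered by the query.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (plain Java, no Hibernate) of how an eager association multiplies selects. */
public class EagerFetchModel {
    /** Counts simulated round trips to the database. */
    private static int selectCount = 0;

    /** Simulates the main query: one select that returns the ids of n rows. */
    public static List<Integer> loadParents(int n) {
        selectCount++;
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            ids.add(i);
        }
        return ids;
    }

    /** Simulates a follow-up "select ... where id = ?" for one associated row. */
    public static String loadAssociated(int id) {
        selectCount++;
        return "associated-" + id;
    }

    /** Eager association not covered by the query: one extra select per row. */
    public static int runEager(int rows) {
        selectCount = 0;
        for (int id : loadParents(rows)) {
            loadAssociated(id);
        }
        return selectCount;
    }

    /** Lazy association that is never accessed: only the main select runs. */
    public static int runLazyUntouched(int rows) {
        selectCount = 0;
        loadParents(rows);
        return selectCount;
    }

    public static void main(String[] args) {
        System.out.println("eager selects: " + runEager(521));
        System.out.println("lazy selects:  " + runLazyUntouched(521));
    }
}
```

In this model the eager case issues rows + 1 selects and the untouched lazy case issues exactly one, which is the pattern visible in the log above. How Hibernate actually batches or caches these loads depends on the mapping, so treat the numbers as illustrative only.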

You need to either map it as lazy or explicitly mention it in your query with join fetch. The latter, however, may be undesirable because of both the volume of data that may be involved and the (possibly) duplicate root entity instances being returned. distinct doesn't always help; look at the SQL you've posted and you'll see it's being applied to ALL columns returned across all tables, which makes it rather pointless.
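As a sketch of the second option, the original HQL could be extended with one more fetch join. The property path changeEntry.repositoryEntityVersion is an assumption: it is inferred from the repositoryEntityVersion_id column on ChangeEntry in the log, not from the actual mapping, and the class name ChangesetQueries is made up for illustration. A left join is used so that change entries without a version are not filtered out. Because the logged select also pulls in further associations (parent, sourceChangeEntry), this one extra fetch may not remove every additional select.

```java
/** Holds the HQL from the question with one additional, hypothetical fetch join. */
public class ChangesetQueries {
    public static final String CHANGESETS_BY_PROJECT =
        "select distinct changeset " +
        "from Changeset changeset " +
        "join fetch changeset.changeEntries as changeEntry " +
        "join fetch changeEntry.repositoryEntity as repositoryEntity " +
        "join fetch repositoryEntity.project as project " +
        "join fetch changeset.author as changesetAuthor " +
        // ASSUMPTION: ChangeEntry exposes a property named repositoryEntityVersion.
        // Adjust the path to match the real mapping.
        "left join fetch changeEntry.repositoryEntityVersion as repositoryEntityVersion " +
        "where project.id = :projectID ";

    public static void main(String[] args) {
        // Print the query so it can be compared with the original.
        System.out.println(CHANGESETS_BY_PROJECT);
    }
}
```

The string would be passed to the session the same way as the original query, with projectID bound as before. If the extra join makes the result set too large, mapping the association as lazy is the alternative described above.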
