JPA query optimization

Question

We have a Java EE web application running on GlassFish 4.1. It performed well with a small amount of data, but the data keeps growing. As a result, a simple request such as loading a document now takes about 1 minute, because it unnecessarily loads almost the whole database.

These are the entities:

Document Entity:

@Entity
@JsonIdentityInfo(generator=JSOGGenerator.class)
@NamedEntityGraph(
    name = "graph.Document.single",
    attributeNodes = {
        @NamedAttributeNode(value = "project", subgraph = "projectSubgraph")
    },
    subgraphs = {
        @NamedSubgraph(
            name = "projectSubgraph",
            attributeNodes = {
                @NamedAttributeNode("users")
            }
        )
    }
)
public class Document extends BaseEntity {

    @JsonView({ View.Documents.class, View.Projects.class })
    @Column(name = "Name")
    private String name;

    @JsonView({ })
    @JsonProperty(access = Access.WRITE_ONLY)
    @Column(name = "Text", columnDefinition = "TEXT")
    private String text;

    @JsonView({ View.Documents.class })
    @ManyToOne(cascade = { CascadeType.PERSIST, CascadeType.MERGE },
                optional = false)
    @JoinColumn(name = "project_fk")
    private Project project;

    @JsonView({ View.Documents.class, View.Projects.class })
    @OneToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE, CascadeType.REMOVE },
                mappedBy = "document",
                fetch = FetchType.EAGER)
    private Set<State> states = new HashSet<>();

    @JsonView({ })
    @OneToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE, CascadeType.REMOVE },
                fetch = FetchType.LAZY)
    @JoinTable(
        name="DOCUMENT_DEFAULTANNOTATIONS",
        joinColumns={@JoinColumn(name="DOC_ID", referencedColumnName="id")},
        inverseJoinColumns={@JoinColumn(name="DEFANNOTATION_ID", referencedColumnName="id")})
    private Set<Annotation> defaultAnnotations = new HashSet<>();

    ...
}

Project Entity:

@Entity
@JsonIdentityInfo(generator=JSOGGenerator.class)
public class Project extends BaseEntity {

    @JsonView({ View.Projects.class })
    @Column(name = "Name", unique = true)
    private String name;

    @JsonView({ View.Projects.class })
    @OneToMany(mappedBy = "project",
                cascade = { CascadeType.PERSIST, CascadeType.MERGE, CascadeType.REMOVE },
                fetch = FetchType.EAGER)
    private Set<Document> documents = new HashSet<>();

    @JsonView({ View.Projects.class })
    @ManyToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE }, fetch = FetchType.EAGER)
    @JoinTable(
        name="PROJECTS_MANAGER",
        joinColumns={@JoinColumn(name="PROJECT_ID", referencedColumnName="id")},
        inverseJoinColumns={@JoinColumn(name="MANAGER_ID", referencedColumnName="id")})
    private Set<Users> projectManager = new HashSet<>();

    @JsonView({ View.Projects.class })
    @ManyToMany(cascade = { CascadeType.PERSIST, CascadeType.MERGE }, fetch = FetchType.EAGER)
    @JoinTable(
        name="PROJECTS_WATCHINGUSERS",
        joinColumns={@JoinColumn(name="PROJECT_ID", referencedColumnName="id")},
        inverseJoinColumns={@JoinColumn(name="WATCHINGUSER_ID", referencedColumnName="id")})
    private Set<Users> watchingUsers = new HashSet<>();

    @JsonView({ View.Projects.class })
    @ManyToMany(mappedBy = "projects",
                cascade = { CascadeType.PERSIST, CascadeType.MERGE },
                fetch = FetchType.EAGER)
    private Set<Users> users = new HashSet<>();

    @JsonView({ View.Projects.class })
    @ManyToOne(cascade = { CascadeType.PERSIST, CascadeType.MERGE },
                fetch = FetchType.EAGER)
    @JoinColumn(name="Scheme", nullable = false)
    private Scheme scheme;

    ...
}

The data model is pretty complex and has partially cyclic structures.

The corresponding DocumentDAO:

@Stateless
@TransactionAttribute(TransactionAttributeType.MANDATORY)
public class DocumentDAO extends BaseEntityDAO<Document> {

    public DocumentDAO() {
        super(Document.class);
    }

    public Document getDocumentById(Long docId) {

        EntityGraph graph = this.em.getEntityGraph("graph.Document.single");

        TypedQuery query = em.createQuery("SELECT d.id AS id, d.name AS name, d.project AS project " +
                                    "FROM Document d " +
                                    "JOIN FETCH d.project " +
                                    "WHERE d.id = :id ", Document.class);
        query.setParameter("id", docId);
        //query.setHint("javax.persistence.loadgraph", graph);
        //query.setHint("javax.persistence.fetchgraph", graph); //evokes an exception
        Object[] result  = (Object[]) query.getSingleResult();

        Document doc = new Document();
        doc.setId((Long) result[0]);
        doc.setName((String) result[1]);
        doc.setProject((Project) result[2]);

        return doc;
    }

}

Before that, a simple em.find(Document.class, docId) was slow as well. So the next attempt was to create a NamedEntityGraph to override the fetching strategy. Passing the graph as a hint (em.find(Document.class, docId, hints)) didn't change anything. Writing a JPQL query like the one in the DocumentDAO showed the same behaviour. Setting the NamedEntityGraph as a hint on that query only raised "org.eclipse.persistence.exceptions.QueryException.fetchGroupNotSupportOnReportQuery: Fetch group cannot be set on report query". I enabled EclipseLink logging and can see that the request triggers tons of unnecessary SQL queries.

The aim is only to return a Document object containing the id, the name and the corresponding project object. The project object should only contain the users. I am also wondering why the NamedEntityGraph didn't change anything. Am I using it incorrectly?

We use EclipseLink 2.6.2 and PostgreSQL.

Update:

Snippets from the logging:

[2016-06-05T17:50:27.875+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=31 _ThreadName=http-listener-1(3)] [timeMillis: 1465141827875] [levelValue: 500] [[
  SELECT t1.ID, t1.EndS, t1.NotSure, t1.StartS, t1.Text, t1.document_fk, t1.targetType_fk, t1.user_fk FROM DOCUMENT_DEFAULTANNOTATIONS t0, ANNOTATION t1 WHERE ((t0.DOC_ID = ?) AND (t1.ID = t0.DEFANNOTATION_ID))
    bind => [38]]]

[2016-06-05T17:50:27.877+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=31 _ThreadName=http-listener-1(3)] [timeMillis: 1465141827877] [levelValue: 500] [[
  SELECT t1.ID, t1.EndS, t1.NotSure, t1.StartS, t1.Text, t1.document_fk, t1.targetType_fk, t1.user_fk FROM DOCUMENT_DEFAULTANNOTATIONS t0, ANNOTATION t1 WHERE ((t0.DOC_ID = ?) AND (t1.ID = t0.DEFANNOTATION_ID))
    bind => [39]]]

... 

[2016-06-05T17:50:27.771+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=31 _ThreadName=http-listener-1(3)] [timeMillis: 1465141827771] [levelValue: 500] [[
  SELECT t1.ID, t1.LABEL_LabelId FROM ANNOTATION_LABELMAP t0, LABELLABELSETMAP t1 WHERE ((t0.ANNOTATION_ID = ?) AND (t1.ID = t0.MAP_ID))
    bind => [53649]]]

[2016-06-05T17:50:27.773+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=31 _ThreadName=http-listener-1(3)] [timeMillis: 1465141827773] [levelValue: 500] [[
  SELECT t1.ID, t1.LABEL_LabelId FROM ANNOTATION_LABELMAP t0, LABELLABELSETMAP t1 WHERE ((t0.ANNOTATION_ID = ?) AND (t1.ID = t0.MAP_ID))
    bind => [53650]]]

...

[2016-06-05T17:56:50.881+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=30 _ThreadName=http-listener-1(2)] [timeMillis: 1465142210881] [levelValue: 500] [[
  SELECT t1.ID, t1.LABEL_LabelId FROM ANNOTATION_LABELMAP t0, LABELLABELSETMAP t1 WHERE ((t0.ANNOTATION_ID = ?) AND (t1.ID = t0.MAP_ID))
    bind => [44220]]]

[2016-06-05T17:56:50.886+0200] [glassfish 4.1] [FINE] [] [org.eclipse.persistence.session./file:/Users/timtoheus/NetBeansProjects/discanno/target/discanno-1.0/WEB-INF/classes/_DiscAnnoPU.sql] [tid: _ThreadID=30 _ThreadName=http-listener-1(2)] [timeMillis: 1465142210886] [levelValue: 500] [[
  SELECT t1.ID, t1.LABEL_LabelId FROM ANNOTATION_LABELMAP t0, LABELLABELSETMAP t1 WHERE ((t0.ANNOTATION_ID = ?) AND (t1.ID = t0.MAP_ID))
    bind => [44221]]]

...

The total number of queries is about 100,000. The log refers to other entities that are not needed for this request. The end result should be about 500 KB, not 7.1 MB.

Answer 1

I don't know about your data, but this is what I think is happening.

You have the following eager associations:

document -> project (@ManyToOne is eager by default)
document -> states
project -> documents
project -> users
user -> ... (not shown in the question, but there could be other eager associations)

After you load a document with its corresponding project:

1. all the project's documents are fetched
2. all project users are fetched
3. all document states are fetched
4. for each document in step 1, document states are fetched
5. for each user in step 2, all eager associations are loaded

You see where I'm going. I see this as a combination of the N+1 problem and excessive use of eager loading where you don't need it.
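To make the cascade described above concrete, here is a toy model in plain Java with no JPA involved. It treats every automatically followed association as an edge and counts how many entities end up loaded when you ask for a single document. The class name, the method names and the data sizes are all made up for illustration; this is only a sketch of the effect, not a picture of what EclipseLink does internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model: which entities get pulled in when associations are followed automatically. */
class EagerCascadeDemo {

    /** entity -> entities that are loaded automatically together with it */
    private final Map<String, List<String>> autoLoaded = new HashMap<>();

    /** Registers that loading 'from' also loads 'to'. */
    void follow(String from, String to) {
        autoLoaded.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Returns every entity that ends up in memory after loading 'root'. */
    Set<String> load(String root) {
        Set<String> loaded = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!loaded.add(current)) {
                continue; // already loaded; cycles such as document <-> project stop here
            }
            for (String next : autoLoaded.getOrDefault(current, Collections.<String>emptyList())) {
                pending.push(next);
            }
        }
        return loaded;
    }

    /** Hypothetical data set: one project with some documents, states per document, and users. */
    static EagerCascadeDemo sample(int docs, int statesPerDoc, int users, boolean eagerCollections) {
        EagerCascadeDemo graph = new EagerCascadeDemo();
        for (int d = 1; d <= docs; d++) {
            graph.follow("doc" + d, "project");                 // document -> project
            if (eagerCollections) {
                graph.follow("project", "doc" + d);             // project -> documents
                for (int s = 1; s <= statesPerDoc; s++) {
                    graph.follow("doc" + d, "state" + d + "." + s); // document -> states
                }
            }
        }
        for (int u = 1; u <= users; u++) {
            graph.follow("project", "user" + u);                // project -> users (wanted in both cases)
        }
        return graph;
    }
}
```

With these made-up sizes, EagerCascadeDemo.sample(100, 50, 10, true).load("doc1").size() gives 5111 entities, while the same call with false as the last argument gives 12: the document, its project and the ten users.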

I would say the EAGER fetching strategy is not ideal for complex object graphs. I would make most of the associations lazy and load the object graphs you need with JOIN FETCH clauses in JPQL.
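A rough sketch of what such a query could look like for the Document case from the question. It assumes that the collections on Document and Project (states, documents, projectManager, watchingUsers, users) have been switched to fetch = FetchType.LAZY. The DocumentQueries class and its method name are hypothetical, and I have not run this against your model.

```java
import javax.persistence.EntityManager;
import javax.persistence.TypedQuery;

/** Hypothetical helper; Document is the entity from the question. */
class DocumentQueries {

    /**
     * Loads one document together with its project and the project's users.
     * Assumes the other associations are mapped LAZY, so they are not
     * touched until something accesses them.
     */
    static Document findWithProjectAndUsers(EntityManager em, Long docId) {
        // Selecting the entity itself (not d.id, d.name, ...) keeps this an
        // entity query rather than a report query, so JOIN FETCH is allowed.
        TypedQuery<Document> query = em.createQuery(
                "SELECT d FROM Document d "
              + "JOIN FETCH d.project p "     // alias on a fetch join: EclipseLink extension, not plain JPA
              + "LEFT JOIN FETCH p.users "
              + "WHERE d.id = :id", Document.class);
        query.setParameter("id", docId);
        return query.getSingleResult();
    }
}
```

The "Fetch group cannot be set on report query" error in the question presumably comes from selecting individual attributes (d.id, d.name, d.project) instead of the entity. With a query of the form SELECT d FROM Document d, setting the javax.persistence.fetchgraph hint should at least be accepted. Also keep in mind that JSON serialization walks the getters, so any lazy association that is still visible in the active @JsonView will be loaded at that point anyway.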