
EclipseLink - detached entities memory leak

Setup

We are currently using WildFly with EclipseLink as the JPA implementation in a Jakarta EE application. The application itself is a RESTful web server with REST, Service and DAO layers. The DAO is the only layer that uses the EntityManager. We always detach entities, for several reasons:

  • To prevent EclipseLink from automatically checking state and flushing changes to the database
  • To prevent EclipseLink from reusing the same object on multiple reads ...

However, with this approach we have noticed spikes in memory usage that in some cases lead to OutOfMemory errors.

Diagnostics

Using VisualVM we have traced the problem to a very large number of entity instances being held in memory.

Test code

This is a sample of the code we are having problems with (a migration of some historic data):

LinkedList<SomeEntity> entities; // The set of entities to process is loaded here
while (!entities.isEmpty()) {
    // Iterate in queue fashion so that GC can reclaim already processed items
    SomeEntity entity = entities.removeFirst();
    if (entity.getItems().isEmpty()) {
        // This call is transactional
        entityService.delete(entity.getId());
    } else if (entity.getItems().stream().anyMatch(item -> item.getQuantity() > 0.0)) {
        // DO SOME CHANGES ON ENTITY
        // This call is transactional
        entityService.update(entity);
    }
    entity = null;
}
entities = null;

Observations

  • While profiling memory usage we can see an ever-increasing count of instances of one entity class in memory. It is not the entity being processed in the test code, but the entity that is referenced most often by other objects. Sometimes some of the instances are collected, but the overall number keeps increasing over time
  • The number of instances greatly exceeds the number of records in the database
  • This means that every time the object is referenced in a relation, a new instance is created (this is OK)
  • When we created a heap dump and looked at where the objects are referenced from, only EclipseLink internal structures showed up (for example relationshipSourceObject in org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder#90312, owner in org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener#26713, ...)

What we have tried

None of this helped:

  • Setting eclipselink.cache.type.default to WEAK, SOFT or even NONE
  • Manually calling EntityManager.clear at the end of the while loop

In my understanding WEAK should be enough to prevent EclipseLink from holding references for too long and blocking GC. But the instances are retained somewhere anyway, and since those references are reachable from GC roots they are never cleared. Can anyone explain this behavior or point me in the right direction?

EDITS

Addressing the comment and Chris's answer. More information about how we use the EntityManager and transactions.

We detach using the EntityManager.detach method, and relationships (@OneToMany, @ManyToMany, etc.) have CascadeType.DETACH applied. Any lazy references we need are loaded prior to the detach.

I agree with the point about re-fetching entities. I would not mind having multiple instances of the same entity in memory for some time. My problem is why they are not garbage collected.

The list of entities in the sample code is loaded in one transaction. Each subsequent database UPDATE or DELETE (which also fetches some data into memory, creating more instances) runs in a separate transaction per entity. I would expect most of the heap to be used during the initial load and then to shrink slowly or stay roughly the same.

About using EntityManager

We are using WildFly as the Jakarta EE container. By default it ships with Hibernate as the JPA provider, but we have added EclipseLink as a module and configured the provider in persistence.xml.

According to the documentation, a container-managed EntityManager creates instances as needed.

Are you caching entities? Calling clear is not enough to let you cache them effectively, and if that is what you are trying to do, it is likely related to your current issue. Everything loaded from an EntityManager has a reference to that EntityManager, so I would guess that you are reading in a large list of partially fetched entities and caching them, then using EntityManager.clear() to try to detach them.

Those entities are then no longer 'managed' but still reference the EntityManager. As soon as you fetch something, such as with the entity.getItems() call shown in your code (assuming this is a standard OneToMany with a back pointer, which is lazily loaded by default), this forces all 'items' to be fetched into memory. As they have a back reference and 'this' entity is no longer referenced by the EntityManager, the Item then has to refetch the entity. So you now have two instances of the same Entity in memory: Entity1' -> Item1 -> Entity1.
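The duplication described above can be modelled without any JPA classes. The following is only a minimal sketch under stated assumptions: Context, Entity and Item are hypothetical stand-ins written for this illustration, and the HashMap only imitates the "one instance per id" behaviour of a persistence context. It is not how EclipseLink is implemented internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the duplication effect; all types here are hypothetical. */
class IdentityMapSketch {

    /** Hypothetical parent entity. */
    static class Entity {
        final long id;
        final List<Item> items = new ArrayList<>();
        Entity(long id) { this.id = id; }
    }

    /** Hypothetical child holding a back pointer to its parent. */
    static class Item {
        Entity owner;
    }

    /** Stand-in for a persistence context: one instance per id until cleared. */
    static class Context {
        private final Map<Long, Entity> identityMap = new HashMap<>();
        int instancesCreated = 0;

        Entity find(long id) {
            return identityMap.computeIfAbsent(id, key -> {
                instancesCreated++;          // counts a simulated database read
                return new Entity(key);
            });
        }

        /** Simulates a lazy load of the items; the back pointer is resolved via this context. */
        void loadItems(Entity parent) {
            Item item = new Item();
            item.owner = find(parent.id);    // builds a second instance if the map was cleared
            parent.items.add(item);
        }

        void clear() { identityMap.clear(); }
    }

    /** Returns how many Entity instances were built for a single row. */
    static int instancesForOneRow(boolean clearBeforeLazyLoad) {
        Context context = new Context();
        Entity entity = context.find(1L);
        if (clearBeforeLazyLoad) {
            context.clear();                 // entity is no longer known to the context
        }
        context.loadItems(entity);           // lazy load triggered after the optional clear
        return context.instancesCreated;
    }

    public static void main(String[] args) {
        System.out.println("without clear: " + instancesForOneRow(false)); // prints 1
        System.out.println("with clear:    " + instancesForOneRow(true));  // prints 2
    }
}
```

In this model the second instance exists only because the lazy load happens after the clear. Loading the items before clearing keeps the count at one.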

This can easily build up with a more complex object graph and repeated clear calls.

This cannot be solved outright, but the overhead can be reduced by narrowing the scope of what you do in an EntityManager, so that it can be reused for identity purposes within that object graph and then be garbage collected once the objects it was used to read are themselves garbage collected.
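One way to narrow the scope, sketched here purely as an assumption about how the migration could be restructured, is to load only the ids up front and hand them out in small batches. Each batch would then be read, changed and committed inside a single transactional call, so one persistence context covers the whole object graph of that batch. BatchSketch, partition and processInBatches are hypothetical helper names, and the batchWork callback stands in for a transactional service method that is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper for processing a long id list in small, independent batches. */
class BatchSketch {

    /** Splits ids into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the sub-list so a batch does not keep a view onto the full list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /**
     * Calls batchWork once per batch and returns the number of batches.
     * batchWork is assumed to load, modify and save its entities within one
     * transaction, so nothing loaded for one batch is referenced by the next.
     */
    static <T> int processInBatches(List<T> ids, int batchSize, Consumer<List<T>> batchWork) {
        int batchCount = 0;
        for (List<T> batch : partition(ids, batchSize)) {
            batchWork.accept(batch);
            batchCount++;
        }
        return batchCount;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L);
        int batches = processInBatches(ids, 2, batch -> System.out.println("processing " + batch));
        System.out.println("batches: " + batches);
    }
}
```

The batch size is a tuning knob: smaller batches keep fewer instances reachable at once, at the cost of more transactions.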

