EclipseLink - detached entities memory leak
We are currently using WildFly with EclipseLink as the JPA implementation in a Jakarta EE application. The application itself is a RESTful web server with REST, service and DAO layers. The DAO is the only layer that uses the EntityManager. We always detach entities, for various reasons.
However, with this approach we have noticed spikes in memory usage that in some cases lead to OutOfMemory errors.
Using VisualVM, we have pinpointed the problem: a great number of entity instances are held in memory.
This is a sample of the code we are experiencing problems with (a migration of some historic data):
    LinkedList<SomeEntity> entities; // the set of entities to process is loaded here
    while (!entities.isEmpty()) {
        // Iterate in queue fashion so that GC can remove already processed items from memory
        SomeEntity entity = entities.removeFirst();
        if (entity.getItems().isEmpty()) {
            // this call is transactional
            entityService.delete(entity.getId());
        } else if (entity.getItems().stream().anyMatch(item -> item.getQuantity() > 0.0)) {
            // DO SOME CHANGES ON ENTITY
            // this call is transactional
            entityService.update(entity);
        }
        entity = null;
    }
    entities = null;
When we create a heap dump and look at where the objects are referenced from, only EclipseLink internals show up, such as:
    relationshipSourceObject in org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder#90312
    owner in org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener#26713
    ...
None of this helped:
As I understand it, WEAK should be enough to prevent EclipseLink from holding references for too long and blocking GC. But the entities are kept somewhere anyway, and since those references are reachable from GC roots, they are never cleared. Can anyone explain this behavior or point me in the right direction?
Addressing the comment and Chris's answer, here is more information about how we use the EntityManager and transactions.
We detach using the EntityManager.detach method, and references (@OneToMany, @ManyToMany, etc.) have CascadeType.DETACH applied. The necessary lazy-loaded references are loaded prior to detaching.
I agree with the part about re-fetching entities. I would not mind having multiple instances of the same entity in memory for some time. My problem is why they are not garbage collected.
The list of entities in the sample code is loaded in one transaction; each subsequent database UPDATE or DELETE (which also fetches some data into memory, creating more instances) is a separate transaction per entity. I would expect most of the heap to be used during the initial call and then to be slowly released, or to remain roughly the same.
About using EntityManager
We are using WildFly as the Jakarta EE container. By default it ships with Hibernate as the JPA provider, but we have added EclipseLink as a module and configured it as the provider in persistence.xml.
According to the documentation, a container-managed EntityManager creates instances as needed.
Are you caching entities? Clear is not enough to let you cache effectively, and if that is what you are attempting, it is likely related to your current issue. Everything loaded from an EntityManager has a reference to that EntityManager, so I would guess that you are reading in a large list of partially fetched entities, caching them, and then using EntityManager.clear() to try to detach them.
Those entities are then no longer 'managed' but still reference the EntityManager. As soon as you fetch something, such as the entity.getItems() call shown in your code, and assuming this is a standard OneToMany (lazily loaded by default) with a back pointer, this forces all 'items' to be fetched into memory. As they have a back reference and 'this' entity is no longer referenced by the EntityManager, the Item then has to refetch the entity. So you now have two instances of the same entity in memory: Entity1' -> Item1 -> Entity1.
This can easily build up with more complex object graphs and repeated clear calls.
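To make the reference chain described above concrete, here is a toy model in plain Java. It is only a sketch: Context, Entity and Item are hypothetical stand-ins invented for illustration, not EclipseLink classes, and the real provider's internals are more involved. The model assumes that a loaded object keeps a link to the context it came from and that a lazy load resolves the parent through that context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these classes are EclipseLink types. */
public class DetachedGraphDemo {

    /** Hypothetical stand-in for a persistence context: an identity map from id to instance. */
    static class Context {
        final Map<Long, Entity> identityMap = new HashMap<>();

        Entity find(long id) {
            // Reuse the tracked instance if there is one, otherwise build a new one.
            return identityMap.computeIfAbsent(id, key -> new Entity(key, this));
        }

        void clear() {
            // The context forgets its instances, but the instances keep their link to it.
            identityMap.clear();
        }
    }

    static class Item {
        final Entity owner; // back pointer to the parent

        Item(Entity owner) {
            this.owner = owner;
        }
    }

    static class Entity {
        final long id;
        final Context source;     // assumed link from a loaded object back to its context
        private List<Item> items; // null until first access (lazy)

        Entity(long id, Context source) {
            this.id = id;
            this.source = source;
        }

        List<Item> getItems() {
            if (items == null) {
                // Lazy load: the item resolves its parent through the context.
                items = new ArrayList<>();
                items.add(new Item(source.find(id)));
            }
            return items;
        }
    }

    /** Returns the original instance and the instance reached through its item. */
    static Entity[] loadAfterClear() {
        Context context = new Context();
        Entity original = context.find(1L);
        context.clear(); // 'original' is no longer tracked
        Entity viaItem = original.getItems().get(0).owner;
        return new Entity[] { original, viaItem };
    }

    public static void main(String[] args) {
        Entity[] pair = loadAfterClear();
        System.out.println("same id: " + (pair[0].id == pair[1].id));
        System.out.println("same instance: " + (pair[0] == pair[1]));
    }
}
```

In this model the two instances share an id but are different objects, and both still reach the same Context, so neither the context nor anything it tracks can be collected while the detached original is alive.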
This cannot be solved outright, but the overhead can be reduced by narrowing the scope of what you do in a single EntityManager, so that it can be reused for identity purposes within that object graph and then garbage collected once the objects it was used to read are themselves garbage collected.
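One possible reading of that advice, offered as a sketch rather than a confirmed fix: instead of loading the full list of entities up front and keeping it alive across many transactions, load only the ids and hand them out in small batches, so that each batch is loaded, processed and released inside one short-lived scope. The helper below is plain Java with hypothetical names (ScopedBatchDemo, forEachBatch). In the real application the callback would presumably be a transactional service method that loads the entities for the given ids and applies the delete or update decision.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ScopedBatchDemo {

    /**
     * Splits 'ids' into consecutive batches of at most 'batchSize' elements
     * and passes each batch to 'work'. Returns the number of batches processed.
     */
    static <T> int forEachBatch(List<T> ids, int batchSize, Consumer<List<T>> work) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            // Copy the slice so the batch does not keep a view onto the full list.
            work.accept(new ArrayList<>(ids.subList(from, to)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L);
        // Stand-in for a transactional call that loads, processes and releases one batch.
        int batches = forEachBatch(ids, 2, batch -> System.out.println("processing " + batch));
        System.out.println(batches + " batches");
    }
}
```

The point of the design is that nothing loaded inside the callback is referenced after it returns, so each batch's object graph, together with whatever context it points back to, becomes unreachable as a unit.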