
Lucene runs out of memory when the index is opened multiple times

My application receives multiple requests per second because bots are crawling our site. I use Lucene for indexing and searching. On the first request after the site restarts, the application opens the Lucene index and stores the resulting object. From the second request onward, it uses the stored object. The problem is that until the index has been completely opened and stored, multiple concurrent requests each try to open it again. This causes the site to run out of memory after 5-10 minutes.

These are the errors:

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.TreeMap.put(Unknown Source)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:61)
    at org.apache.lucene.codecs.lucene42.Lucene42FieldInfosReader.read(Lucene42FieldInfosReader.java:96)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:121)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:56)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
    at com.webjaguar.web.frontend.LuceneCategery.getLuceneProduct(LuceneCategery.java:166)
    at com.webjaguar.web.frontend.CategoryController.handleRequest(CategoryController.java:1034)
    at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:48)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:778)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:624)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:312)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:116)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:83)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:324)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:324)
    at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:324)
    at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)

Second error:

   Exception in thread "Lucene Merge Thread #9" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:541)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:514)
Caused by: java.lang.OutOfMemoryError: Java heap space
java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2661)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:981)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:883)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:845)
    at com.webjaguar.thirdparty.lucene.LuceneProductIndexer.reIndex(LuceneProductIndexer.java:750)
    at com.webjaguar.web.quartz.LuceneProductJob.autoIndex(LuceneProductJob.java:90)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:273)
    at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:311)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:113)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:223)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)

This is the line that throws the error:

reader = DirectoryReader.open(NIOFSDirectory.open(indexFile));

Is there a way to lock the file until it has been stored? Any suggestions for improving the implementation are welcome.

You should have a look at the LockFactory of the NIOFSDirectory (inherited from its parent class, Directory). See the LockFactory Javadoc for more information.

In addition, your requirements look like an NRT (near-real-time) use case to me. If you want to index and search within a short time period, and indexing runs continuously, an NRT implementation would make sense. I'm not sure whether this is already a feature of Lucene v4.2. See the Simple NRT tutorial for additional information.
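The race described in the question (several requests opening the index before the first open has finished) can also be guarded against inside the application itself. The following is only a minimal sketch, not the asker's code and not a Lucene API: `LazyShared` and `IoSupplier` are hypothetical names, and the sketch uses plain JDK double-checked locking so that only one thread ever runs the expensive open while all others wait for and reuse its result.

```java
import java.io.IOException;

/** Opens an expensive resource at most once and shares it between threads. */
final class LazyShared<T> {

    /** Like java.util.function.Supplier, but allowed to throw IOException. */
    interface IoSupplier<T> {
        T get() throws IOException;
    }

    private final IoSupplier<T> opener;
    private volatile T value; // null until the first successful open

    LazyShared(IoSupplier<T> opener) {
        this.opener = opener;
    }

    T get() throws IOException {
        T result = value;             // fast path: no locking once initialized
        if (result == null) {
            synchronized (this) {     // only one thread may run the opener
                result = value;       // re-check: another thread may have finished first
                if (result == null) {
                    result = opener.get();
                    value = result;   // publish for all later callers
                }
            }
        }
        return result;
    }
}
```

Assuming the surrounding code from the question, the opener would be something like `() -> DirectoryReader.open(NIOFSDirectory.open(indexFile))`, held in a single application-wide `LazyShared` instance. If the opener throws, nothing is stored and the next request tries again. Replacing the shared reader after a re-index is not covered by this sketch.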
