ASP.NET Lucene Performance Improvements question

I have built an ASP.NET website that runs on Windows Server 2008 (remotely hosted). The application queries 11 very large Lucene indexes (about 100 GB each). I open the IndexSearchers in Page_load() and keep them open for the duration of the user session.

My questions:

  1. The queries take about 5 seconds to complete. That is understandable, since these are very large indexes, but users want faster responses, and I would like to squeeze out better performance. (I did look over the Apache Lucene website and tried some of the ideas there.) I am interested in whether and how you tuned it further, especially from an ASP.NET perspective.

  2. One idea was to use Solr instead of querying Lucene directly. That seems counter-intuitive, though: it introduces another abstraction in between and might add latency. Is porting to Solr worth the headache? If it has been worth it for you, can you share some metrics on the improvement you got after switching?

  3. Are there some key things that could be done in Solr that could be replicated to speed up response times?

Some questions / ideas:

  • Are you hitting all 11 indexes for a single request?
  • Can you reorganize the indexes so that you hit only 1 index (i.e. sharding)?
  • Have you profiled the application (using dotTrace or a similar tool)? Where is the time spent? In Lucene.Net?
  • If most of the time is spent in Lucene.Net, then the extra latency from migrating to Solr should be negligible compared to the time spent searching. Plus, Solr can easily be distributed to increase performance.
  • I'm not all that familiar with Lucene (I use Solr), but if you're searching 11 indexes per request, can you run those searches in parallel (e.g. with TPL)?
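
To illustrate the last bullet, here is a minimal sketch of fanning one query out to several indexes in parallel. It is written in Java (Lucene's native language) rather than C#; in .NET the same shape could be built with TPL tasks and `Task.WhenAll`. `ShardSearcher`, `ParallelSearch` and `searchAll` are hypothetical names, not Lucene or Lucene.Net APIs, and the merge step simply concatenates hits where a real one would re-rank them by score.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical stand-in for one open index searcher; not a Lucene type.
interface ShardSearcher {
    List<String> search(String query) throws Exception;
}

final class ParallelSearch {
    // Runs the query against every shard concurrently and concatenates
    // the hits in shard order.
    static List<String> searchAll(List<ShardSearcher> shards, String query,
                                  ExecutorService pool) throws Exception {
        List<Callable<List<String>>> tasks = new ArrayList<>();
        for (ShardSearcher shard : shards) {
            tasks.add(() -> shard.search(query));
        }
        List<String> merged = new ArrayList<>();
        // invokeAll blocks until every task is done; the returned futures
        // are in the same order as the submitted tasks.
        for (Future<List<String>> result : pool.invokeAll(tasks)) {
            // get() rethrows a failed shard search as an ExecutionException.
            merged.addAll(result.get());
        }
        return merged;
    }
}
```

Whether this helps depends on what the profiler shows: if all 11 indexes sit on the same disk, parallel reads may simply contend for I/O instead of finishing sooner.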

The biggest thing is removing the search from the web tier and isolating it in its own tier (a search tier). That way, you have a dedicated box with dedicated resources that keeps the indexes loaded and "warmed up" in cache, instead of each user session holding its own copy of an index reader.
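
A rough sketch of the "one shared, warmed searcher" idea, as opposed to one per user session. This is Java for illustration only; `SharedResource` is a hypothetical helper, not part of Lucene, and in .NET `Lazy<T>` would cover the same ground. It assumes the underlying searcher is safe to share across threads, which Lucene documents for `IndexSearcher`.

```java
import java.util.function.Supplier;

// Holds one lazily opened, application-wide instance of an expensive
// resource (for example an index searcher) that every request shares.
final class SharedResource<T> {
    private final Supplier<T> opener;
    private volatile T instance;

    SharedResource(Supplier<T> opener) {
        this.opener = opener;
    }

    // Opens the resource on first use; later calls return the same object.
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

With something like this, the cost of opening and warming the index is paid once per process rather than once per session, whether the holder lives in the web application or in a separate search service.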
