
ASP.NET Lucene Performance Improvements question

I have coded up an ASP.NET website that runs on Windows Server 2008 (remotely hosted). The application queries 11 very large Lucene indexes (each ~100GB). I open the IndexSearchers in Page_Load() and keep them open for the duration of the user session.

My questions:

  1. The queries take ~5 seconds to complete. That is understandable, since these are very large indexes, but users want faster responses, so I would like to squeeze out better performance. (I did look over the Apache Lucene website and tried some of the ideas there.) I am interested in whether and how you tuned it further, especially from an ASP.NET perspective.

  2. One idea was to use Solr instead of querying Lucene directly. But that seems counter-intuitive: it introduces another abstraction in between and might add to the latency. Is it worth the headache of porting to Solr? If it was worth it for you, can you share some metrics on the improvement you got after switching to Solr?

  3. Are there some key things Solr does that could be replicated to speed up response times?

Some questions / ideas:

  • Are you hitting all 11 indexes for a single request?
  • Can you reorganize the indexes so that you hit only 1 index (i.e. sharding)?
  • Have you profiled the application (using dotTrace or a similar tool)? Where is the time spent? In Lucene.Net?
  • If most of the time is spent in Lucene.Net, then the latency added by migrating to Solr should be negligible (compared to the rest of the time spent). Plus, Solr can easily be distributed to increase performance.
  • I'm not all that familiar with Lucene (I use Solr), but if you're searching 11 indexes per request, can you run those searches in parallel (e.g. with the TPL)?
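The parallel-search idea in the last bullet can be sketched roughly as follows. This is a minimal, library-free illustration in Java (Lucene's original language), not Lucene or Lucene.Net code: `Hit`, `ParallelIndexSearch`, `searchAll` and `fakeIndex` are hypothetical names, and each "searcher" is a stub function standing in for a query against one index. In an ASP.NET application the same shape would presumably be built from TPL tasks. The merge step assumes scores from different indexes are comparable, which may not hold when each index computes its own term statistics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical hit type: a document id plus a relevance score.
final class Hit {
    final String docId;
    final double score;
    Hit(String docId, double score) { this.docId = docId; this.score = score; }
}

class ParallelIndexSearch {
    // Starts one search per index on the pool, waits for all of them,
    // then merges the results and keeps the topN highest-scoring hits.
    static List<Hit> searchAll(List<Function<String, List<Hit>>> searchers,
                               String query, int topN, ExecutorService pool) {
        List<CompletableFuture<List<Hit>>> futures = new ArrayList<>();
        for (Function<String, List<Hit>> searcher : searchers) {
            futures.add(CompletableFuture.supplyAsync(() -> searcher.apply(query), pool));
        }
        // All searches are already running; join() only collects the results.
        return futures.stream()
                .flatMap(f -> f.join().stream())
                .sorted(Comparator.comparingDouble((Hit h) -> h.score).reversed())
                .limit(topN)
                .collect(Collectors.toList());
    }

    // Stub for one index: sleeps to simulate a slow query, returns one hit.
    static Function<String, List<Hit>> fakeIndex(int id, long delayMs) {
        return query -> {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Collections.emptyList();
            }
            return Collections.singletonList(new Hit("index" + id + "/" + query, id));
        };
    }

    public static void main(String[] args) {
        List<Function<String, List<Hit>>> searchers = new ArrayList<>();
        for (int i = 1; i <= 11; i++) {
            searchers.add(fakeIndex(i, 200));
        }
        ExecutorService pool = Executors.newFixedThreadPool(searchers.size());
        long start = System.nanoTime();
        List<Hit> top = searchAll(searchers, "lucene", 5, pool);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        pool.shutdown();
        for (Hit h : top) {
            System.out.println(h.docId + " " + h.score);
        }
        System.out.println("elapsed ms: " + elapsedMs);
    }
}
```

With one thread per index, the total wait is roughly that of the slowest index rather than the sum of all 11, provided the disk and CPU can actually serve the queries concurrently. Profiling first, as suggested above, would show whether that is the case.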

The biggest thing is removing the search from the web tier and isolating it in its own tier (a search tier). That way, you have a dedicated box with dedicated resources that has the indexes loaded and "warmed up" in cache, instead of each user session holding its own copy of the index reader.
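Even before a separate search tier exists, the "one shared, warmed-up searcher instead of one per session" part can be illustrated on its own. The sketch below is a generic open-once holder in plain Java; `Shared`, `FakeSearcher`, `SearchTier` and the index path are hypothetical, and `FakeSearcher` merely stands in for a real searcher. It relies on the searcher being safe to share across threads, which Lucene documents for `IndexSearcher`; it is worth confirming for the Lucene.Net version in use.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for an expensive-to-open, thread-safe index searcher.
class FakeSearcher {
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    final String indexPath;

    FakeSearcher(String indexPath) {
        this.indexPath = indexPath;
        OPEN_COUNT.incrementAndGet(); // counts how many times an index was opened
    }

    String search(String query) {
        return indexPath + ":" + query;
    }
}

// Opens the wrapped resource once, on first use, and hands the same
// instance to every later caller (double-checked locking on a volatile field).
final class Shared<T> implements Supplier<T> {
    private final Supplier<T> opener;
    private volatile T instance;

    Shared(Supplier<T> opener) {
        this.opener = opener;
    }

    @Override
    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}

class SearchTier {
    // One process-wide searcher, shared by all requests and sessions.
    static final Shared<FakeSearcher> SEARCHER =
            new Shared<>(() -> new FakeSearcher("/indexes/index-01")); // hypothetical path

    static String handleRequest(String query) {
        return SEARCHER.get().search(query);
    }

    public static void main(String[] args) {
        // Three "requests" reuse the same searcher instead of reopening the index.
        System.out.println(handleRequest("first"));
        System.out.println(handleRequest("second"));
        System.out.println(handleRequest("third"));
        System.out.println("indexes opened: " + FakeSearcher.OPEN_COUNT.get());
    }
}
```

The design choice here is that opening an index and warming its caches is paid once per process, not once per user session. A real deployment would also need a way to swap in a fresh searcher after the index is updated, which this sketch leaves out.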
