Proper structuring of Lucene.Net usage in an ASP.NET MVC site

I'm building an ASP.NET MVC site where I plan to use Lucene.Net. I've envisioned a way to structure the usage of Lucene, but I'm not sure whether my planned architecture is sound and efficient.


My Plan:

  • On the Application_Start event in Global.asax: I check for the existence of the index on the file system. If it doesn't exist, I create it and fill it with documents extracted from the database.
  • When new content is submitted: I create an IndexWriter, fill up a document, write to the index, and finally dispose of the IndexWriter. IndexWriters are not reused, as I can't imagine a good way to do that in an ASP.NET MVC application.
  • When content is edited: I repeat the same process as when new content is submitted, except that I first delete the old content and then add the edits.
  • When a user searches for content: I check HttpRuntime.Cache to see if a user has already searched for this term in the last 5 minutes. If they have, I return those results; otherwise, I create an IndexReader, build and run a query, put the results in HttpRuntime.Cache, return them to the user, and finally dispose of the IndexReader. Once again, IndexReaders aren't reused.

My Questions:

  • Is that a good structure, and how can I improve it?
  • Are there any performance/efficiency problems I should be aware of?
  • Also, is not reusing the IndexReaders and IndexWriters a huge code smell?

The answer to all three of your questions is the same: reuse your readers (and possibly your writers). You can use a singleton pattern to do this (i.e. declare your reader/writer as public static). Lucene's FAQ tells you the same thing: share your readers, because the first query on a newly opened reader is really slow. Lucene handles all the locking for you, so there is really no reason why you shouldn't have a shared reader.
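A minimal sketch of the singleton idea, assuming only the .NET base class library: `Lazy<T>` creates the shared instance once, on first use, even if several requests race for it. `ExpensiveResource` and `SharedIndex` are hypothetical names; in a real site the shared object would be the Lucene.Net reader or writer.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for something costly to open, such as an index reader or writer.
public sealed class ExpensiveResource
{
    // Counts how many instances were ever constructed.
    public static int OpenCount;

    public ExpensiveResource()
    {
        Interlocked.Increment(ref OpenCount);
    }
}

// Hypothetical holder: one shared instance for the whole application.
public static class SharedIndex
{
    // ExecutionAndPublication guarantees the factory runs at most once,
    // even when several threads ask for the value at the same time.
    private static readonly Lazy<ExpensiveResource> Holder =
        new Lazy<ExpensiveResource>(
            () => new ExpensiveResource(),
            LazyThreadSafetyMode.ExecutionAndPublication);

    public static ExpensiveResource Instance
    {
        get { return Holder.Value; }
    }
}
```

Every controller action would then read `SharedIndex.Instance` instead of opening its own copy.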

It's probably easiest to just keep your writer around and (using the near-real-time, or NRT, model) get the readers from that. If you rarely write to the index, or if you don't have a huge need for speed, then it's probably OK to open your writer each time instead. That is what I do.

Edit: added a code sample:

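// Sketch assuming the Lucene.Net 3.0.x API (namespaces Lucene.Net.Index, .Search, .Analysis.Standard, .Util).
// myDir is a static Lucene.Net.Store.Directory opened elsewhere; the analyzer choice is an assumption.
public static readonly IndexWriter writer = new IndexWriter(
    myDir, new StandardAnalyzer(Version.LUCENE_30), IndexWriter.MaxFieldLength.UNLIMITED);

public JsonResult SearchForStuff(string query)
{
    // Near-real-time reader: reflects everything written through the shared writer so far.
    using (IndexReader reader = writer.GetReader())
    using (IndexSearcher search = new IndexSearcher(reader))
    {
        // do the search and return the results as JSON
    }
}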

I would probably skip the caching. Lucene is very, very efficient, perhaps so efficient that searching again is faster than looking the results up in a cache.

Building the full index in Application_Start feels a bit off to me. It should probably run in its own thread so that it doesn't block other expensive startup activities.
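One way to keep the initial build off the startup path is to hand it to a background task. This is a rough sketch, assuming .NET 4.6 or later (for `Task.CompletedTask`); `IndexBootstrapper`, `indexExists`, and `rebuildIndex` are hypothetical names, and the two delegates stand in for the real file-system check and database export.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper that runs the "create the index if it is missing" step off the startup thread.
public static class IndexBootstrapper
{
    // Completes when the initial check (and rebuild, if one was needed) has finished.
    public static Task InitialBuild { get; private set; } = Task.CompletedTask;

    // indexExists: returns true when an index is already on disk (assumed to be supplied by the caller).
    // rebuildIndex: fills a new index from the database (assumed to be supplied by the caller).
    public static void Start(Func<bool> indexExists, Action rebuildIndex)
    {
        InitialBuild = Task.Run(() =>
        {
            if (!indexExists())
            {
                rebuildIndex();
            }
        });
    }
}
```

From Application_Start this might be called as `IndexBootstrapper.Start(IndexExists, RebuildFromDatabase)`, where both methods are ones you write yourself. Searches that arrive before `InitialBuild` completes have to cope with a missing or partial index.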
