RavenDB fast substring search

Question

I have perhaps trillions of string sequences, and I'm looking for a fast substring search.

I've created an index. When I try to get some results ( x => x.StartsWith ), it takes about 2 seconds on a database of 3 million objects.

How long might it take on 500 million objects?

Is it possible to make RavenDB search faster?

 store.DatabaseCommands.PutIndex("KeyPhraseInfoByWord", new Raven.Client.Indexes.IndexDefinitionBuilder<KeyPhraseInfo>
   {
    // Index only the Key field of each KeyPhraseInfo document.
    Map = keyPhraseInfos => from keyPhraseInfo in keyPhraseInfos
                   select new { keyPhraseInfo.Key },
    // SimpleAnalyzer splits the text at non-letters and lowercases the tokens.
    Analyzers =
        {
            { x => x.Key, "SimpleAnalyzer" }
        }
    });

Answer 1

Nier0, yes, you can do really fast NGram search using RavenDB. See: https://gist.github.com/1669767
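For readers unfamiliar with the idea, here is a minimal sketch of why n-gram indexing tends to make substring lookups fast. It is plain C# and does not use RavenDB or Lucene; the class name NGramSketch, its methods, and the gram sizes are hypothetical names chosen for illustration, not the code from the gist. Each key is split into all of its lowercase n-grams at index time, so a substring query whose length falls within the gram range becomes a single dictionary lookup instead of a scan over every key.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a toy in-memory n-gram index, not RavenDB's or Lucene's implementation.
public static class NGramSketch
{
    // Yield every lowercase n-gram of text with length between minGram and maxGram.
    public static IEnumerable<string> NGrams(string text, int minGram, int maxGram)
    {
        string lower = text.ToLowerInvariant();
        for (int n = minGram; n <= maxGram; n++)
        {
            for (int i = 0; i + n <= lower.Length; i++)
            {
                yield return lower.Substring(i, n);
            }
        }
    }

    // Build an inverted index that maps each n-gram to the set of keys containing it.
    public static Dictionary<string, HashSet<string>> BuildIndex(
        IEnumerable<string> keys, int minGram, int maxGram)
    {
        var index = new Dictionary<string, HashSet<string>>();
        foreach (string key in keys)
        {
            foreach (string gram in NGrams(key, minGram, maxGram))
            {
                HashSet<string> bucket;
                if (!index.TryGetValue(gram, out bucket))
                {
                    bucket = new HashSet<string>();
                    index[gram] = bucket;
                }
                bucket.Add(key);
            }
        }
        return index;
    }

    // Look up a substring whose length lies within the indexed gram range.
    // Terms outside that range are not stored as grams and return an empty set here.
    public static HashSet<string> Find(Dictionary<string, HashSet<string>> index, string term)
    {
        HashSet<string> bucket;
        return index.TryGetValue(term.ToLowerInvariant(), out bucket)
            ? bucket
            : new HashSet<string>();
    }
}
```

The trade-off is index size: every key contributes many grams, so the index grows considerably in exchange for lookups that no longer depend on scanning all keys.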

Answer 2

Ayende's excellent NGram analyzer seems to have been written for an older version of Lucene than the one RavenDB uses now, so I made an updated version of it for confused people like me. See: http://pastebin.com/a78XzGDk. All credit goes to Ayende for this one.

To use it, put it in a class library, build it, and drop the resulting assembly into the Analyzers folder under Server in the RavenDB directory. Then create an index like this:

public class PostByNameIndex : AbstractIndexCreationTask<Posts>
{
    public PostByNameIndex()
    {
        Map = posts => posts.Select(x => new {x.Name});
        Analyze(x => x.Name, typeof(NGramAnalyzer).AssemblyQualifiedName);
    }
}

But as I said, all credit and thanks go to Ayende for creating this.

Answer 1: score 12, accepted, 2012-05-29 13:58:42

Answer 2: score 8, 2013-02-28 12:36:35
