Regarding keyword search and Lucene.Net in C#
I am browsing articles on Lucene.Net. I found some sample code that creates an index using Lucene.Net, and a few lines of it are not clear to me. Here are those lines:
protected void btnCreateIndex_Click(object sender, EventArgs e)
{
IndexWriter writer = new IndexWriter(MapPath("~/searchlucene/"), new StandardAnalyzer(), false);
IndexDocument(writer, "About Hockey", "hockey", "Hockey is a cool sport which I really like, bla bla");
IndexDocument(writer, "Some great players", "hockey", "Some of the great players from Sweden - well Peter Forsberg, Mats Sunding, Henrik Zetterberg");
IndexDocument(writer, "Soccer info", "soccer", "Soccer might not be as fun as hockey but it's also pretty fun");
IndexDocument(writer, "Players", "soccer", "From Sweden we have Zlatan Ibrahimovic and Henrik Larsson. They are the most well known soccer players");
IndexDocument(writer, "1994", "soccer", "I remember World Cup 1994 when Sweden took the bronze. we had great players. players , bla bla");
IndexDocument(writer, "BBA-header", "BBA-321type", "Hello BBA");
writer.Optimize();
writer.Close();
}
private void IndexDocument(IndexWriter writer, string sHeader, string sType, string sContent)
{
Document doc = new Document();
doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("type", sType, Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("content", sContent, Field.Store.YES, Field.Index.TOKENIZED));
writer.AddDocument(doc);
}
1) doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED));
What is the meaning of this line? What do Field.Index.TOKENIZED and UN_TOKENIZED mean?
When I search for a keyword that was indexed through the type argument, nothing comes back. I just do not understand this behaviour.
Here is the search sample, where I specify a keyword that was indexed as type:
ListBox1.Items.Clear();
var searcher = new Lucene.Net.Search.IndexSearcher(MapPath("~/searchlucene/"));
var oParser = new Lucene.Net.QueryParsers.QueryParser("content", new StandardAnalyzer());
string sHeader = " OR (header:" + TextBox1.Text + ")";
string sType = " OR (type:" + TextBox1.Text + ")";
string sSearchQuery = "(" + TextBox1.Text + sHeader + sType + ")";
var oHitColl = searcher.Search(oParser.Parse(sSearchQuery));
for (int i = 0; i < oHitColl.Length(); i++)
{
Document oDoc = oHitColl.Doc(i);
ListBox1.Items.Add(new ListItem(oDoc.Get("header") + oDoc.Get("type") + oDoc.Get("content")));
}
searcher.Close();
Could someone please help me understand this and clear up my confusion? Thanks.
I just tested your code, and it works fine with Lucene 2.9.4.
Field.Index.TOKENIZED means the Analyzer will break your text into tokens, so the field becomes searchable as full text. You would use UN_TOKENIZED for fields you don't want analyzed, like product IDs.
Note: you should use Field.Index.ANALYZED and Field.Index.NOT_ANALYZED, which are the replacements for their deprecated TOKENIZED / UN_TOKENIZED counterparts.
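As a rough illustration of the difference, here is a minimal sketch, assuming Lucene.Net 2.9.x (the version tested above) and the same deprecated Hits API used in the question. The class name and the field names "analyzed" and "raw" are made up for this example. It indexes the same text in both modes and looks up exact terms with TermQuery, which does not run the analyzer on the search text.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical demo class; assumes Lucene.Net 2.9.x is referenced.
class AnalyzedVersusNotAnalyzed
{
    static void Main()
    {
        // In-memory index so the sketch leaves no files behind.
        var directory = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29);
        var writer = new IndexWriter(directory, analyzer, true,
                                     IndexWriter.MaxFieldLength.UNLIMITED);

        var doc = new Document();
        // Same text in both fields; only the indexing mode differs.
        doc.Add(new Field("analyzed", "Peter Forsberg",
                          Field.Store.YES, Field.Index.ANALYZED));
        doc.Add(new Field("raw", "Peter Forsberg",
                          Field.Store.YES, Field.Index.NOT_ANALYZED));
        writer.AddDocument(doc);
        writer.Close();

        var searcher = new IndexSearcher(directory, true);
        Report(searcher, "analyzed", "forsberg");   // one lowercased token
        Report(searcher, "raw", "forsberg");        // not a term of its own in this field
        Report(searcher, "raw", "Peter Forsberg");  // the whole value is a single term
        searcher.Close();
    }

    static void Report(IndexSearcher searcher, string field, string text)
    {
        // TermQuery looks up the exact term and bypasses the analyzer.
        var query = new TermQuery(new Term(field, text));
        Hits hits = searcher.Search(query);
        Console.WriteLine(field + ":" + text + " -> " + hits.Length() + " hit(s)");
    }
}
```

With StandardAnalyzer, the first and third lookups should find the document and the second should not: the analyzed field holds the lowercased tokens "peter" and "forsberg", while the not-analyzed field holds the single exact term "Peter Forsberg".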
To see the differences between analyzed and not analyzed fields, you can try both and use Luke to inspect your indexes; that will probably give you a good idea of how it works.
http://code.google.com/p/luke/