
How do you implement a custom filter with Lucene.net?

The code below is from the Lucene in Action book (originally in Java). It builds a list of 'allowed' documents (from a user-permission point of view) to filter search results with. The problem is that the termDocs.Read() method does not accept the 'docs' and 'freq' arrays by reference, so they are still empty when it comes to setting the bit in the bit array.

Can anyone help? Examples of using Lucene custom filters (especially in .NET) seem to be thin on the ground. Thanks.

public class LuceneCustomFilter : Lucene.Net.Search.Filter
{
    string[] _luceneIds;

    public LuceneCustomFilter(string[] luceneIds)
    {
        _luceneIds = luceneIds;
    }

    public override BitArray Bits(Lucene.Net.Index.IndexReader indexReader)
    {
        BitArray bitarray = new BitArray(indexReader.MaxDoc());

        int[] docs = new int[1];
        int[] freq = new int[1];

        for (int i = 0; i < _luceneIds.Length; i++)
        {
            if (!string.IsNullOrEmpty(_luceneIds[i]))
            {
                Lucene.Net.Index.TermDocs termDocs = indexReader.TermDocs(
                    new Lucene.Net.Index.Term(@"luceneId", _luceneIds[i]));

                int count = termDocs.Read(docs, freq);

                if (count == 1)
                {
                    bitarray.Set(docs[0], true);
                }
            }
        }

        return bitarray;
    }
}

I'm using Lucene.net 2.0.0.4, but the TermDocs interface still appears to be the same in the latest branch here: https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/C%23/src/Lucene.Net/Index/TermDocs.cs

Here's a working example of Lucene.NET using a custom filter that you might take a look at:

using System;
using System.Collections;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

class Program
{
    static void Main(string[] args)
    {
        Directory index = new RAMDirectory();
        Analyzer analyzer = new KeywordAnalyzer();
        IndexWriter writer = new IndexWriter(index, analyzer, true);

        Document doc = new Document();
        doc.Add(new Field("title", "t1", Field.Store.YES, 
            Field.Index.TOKENIZED));
        writer.AddDocument(doc);
        doc = new Document();
        doc.Add(new Field("title", "t2", Field.Store.YES, 
            Field.Index.TOKENIZED));
        writer.AddDocument(doc);

        writer.Close();

        Searcher searcher = new IndexSearcher(index);
        Query query = new MatchAllDocsQuery();
        Filter filter = new LuceneCustomFilter();
        Sort sort = new Sort("title", true);
        Hits hits = searcher.Search(query, filter, sort);
        IEnumerator hitsEnumerator = hits.Iterator();

        while (hitsEnumerator.MoveNext())
        {
            Hit hit = (Hit)hitsEnumerator.Current;
            Console.WriteLine(hit.GetDocument().GetField("title").
                StringValue());
        }
    }
}

public class LuceneCustomFilter : Filter
{
    public override BitArray Bits(IndexReader indexReader)
    {
        BitArray bitarray = new BitArray(indexReader.MaxDoc());

        int[] docs = new int[1];
        int[] freq = new int[1];

        TermDocs termDocs = indexReader.TermDocs(
                new Term(@"title", "t1"));

        int count = termDocs.Read(docs, freq);
        if (count == 1)
        {
            bitarray.Set(docs[0], true);
        }
        return bitarray;
    }
}
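
The filter above sets at most one bit, because it reads into one-element arrays. If a term can match several documents, a variation that walks every match with Next() and Doc() may be more robust. This is only a sketch against the Lucene.Net 2.x TermDocs interface as I understand it; the class name MultiDocTermFilter and its constructor parameters are made up for illustration and are not part of Lucene.

```csharp
using System.Collections;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical filter: allows every document that contains the given term.
public class MultiDocTermFilter : Filter
{
    private readonly string _field;
    private readonly string _value;

    public MultiDocTermFilter(string field, string value)
    {
        _field = field;
        _value = value;
    }

    public override BitArray Bits(IndexReader indexReader)
    {
        // One bit per document id; all bits start out false (not allowed).
        BitArray bits = new BitArray(indexReader.MaxDoc());

        TermDocs termDocs = indexReader.TermDocs(new Term(_field, _value));
        try
        {
            // Next() advances to the next matching document,
            // Doc() returns the id of the current one.
            while (termDocs.Next())
            {
                bits.Set(termDocs.Doc(), true);
            }
        }
        finally
        {
            // Release the underlying enumerator when done.
            termDocs.Close();
        }

        return bits;
    }
}
```

It would be used the same way as the filter above, for example `Filter filter = new MultiDocTermFilter("title", "t1");`.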

I'm a bit confused here, because passing an array does in fact pass a reference to it, so any changes the called method makes to the elements are visible to the caller. For instance, the following snippet prints 10101010 (four elements, written without separators), showing that the array values have been updated.

Am I missing something here?

    public void TestPassing()
    {
        int[] stuff = new int[] {5, 5, 5, 5};

        Add(stuff, 5);
        for (int i = 0; i < stuff.Length; i++)
        {
            Console.Write(stuff[i]);
        }
    }

    public void Add(int[] stuff, int x)
    {
        for(int i = 0; i < stuff.Length; i++)
        {
            stuff[i] = stuff[i] + x;
        }
    }
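
To make the distinction concrete, here is a small sketch of what a method can and cannot change about an array it receives. All names in it (ArrayPassingDemo, FillElements, Reassign, ReassignByRef) are made up for illustration. Writing to the elements is visible to the caller, which is all that a method like Read(docs, freq) needs; only replacing the whole array would require the ref keyword.

```csharp
using System;

static class ArrayPassingDemo
{
    // The reference is copied, but both copies point at the same array,
    // so writes to the elements are visible to the caller.
    public static void FillElements(int[] values, int x)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = x;
        }
    }

    // Reassigning the parameter only changes the method's local copy
    // of the reference; the caller's array is untouched.
    public static void Reassign(int[] values)
    {
        values = new int[] { 99 };
    }

    // With 'ref', the caller's variable itself is replaced.
    public static void ReassignByRef(ref int[] values)
    {
        values = new int[] { 99 };
    }

    public static void Main()
    {
        int[] docs = new int[1];

        FillElements(docs, 7);
        Console.WriteLine(docs[0]);   // 7

        Reassign(docs);
        Console.WriteLine(docs[0]);   // still 7

        ReassignByRef(ref docs);
        Console.WriteLine(docs[0]);   // 99
    }
}
```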
