
Lucene case-insensitive sort search

How can I search with a multi-field sort in case-insensitive mode?

I am using Lucene 4.10.4 and sorting on multiple fields like this:

SortField[] sortFiled = new SortField[2];
sortFiled[0] = new SortField("name", SortField.Type.STRING);
sortFiled[1] = new SortField("country", SortField.Type.STRING);

TopDocs topDocs = indexSearcher.search(query, 10 , new Sort(sortFiled));

This returns sorted results, but the sort is case-sensitive. I want it to be case-insensitive.
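The behaviour described above matches how Java itself orders strings: a code-unit comparison puts every uppercase ASCII letter before every lowercase one. The following is a minimal plain-Java sketch with no Lucene involved. The class name, the Row type, and the sample data are made up for illustration. It contrasts the two orderings on a two-field sort:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortOrderDemo {
    // Hypothetical stand-in for an indexed document with two sortable fields.
    static final class Row {
        final String name;
        final String country;
        Row(String name, String country) { this.name = name; this.country = country; }
        @Override public String toString() { return name + "/" + country; }
    }

    static List<Row> sample() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("bob", "india"));
        rows.add(new Row("Alice", "USA"));
        rows.add(new Row("alice", "India"));
        rows.add(new Row("Bob", "usa"));
        return rows;
    }

    // Sort by name, then by country, using the given string ordering for both fields.
    static List<String> sorted(List<Row> rows, Comparator<String> order) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparing((Row r) -> r.name, order)
                .thenComparing(r -> r.country, order));
        List<String> out = new ArrayList<>();
        for (Row r : copy) {
            out.add(r.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Case-sensitive: [Alice/USA, Bob/usa, alice/India, bob/india]
        System.out.println(sorted(sample(), Comparator.<String>naturalOrder()));
        // Case-insensitive: [alice/India, Alice/USA, bob/india, Bob/usa]
        System.out.println(sorted(sample(), String.CASE_INSENSITIVE_ORDER));
    }
}
```

In the case-insensitive run, "Alice" and "alice" tie on the first field, so the second field decides their order.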

SortField[] sortFiled = new SortField[2];
sortFiled[0] = new SortField("name", SortField.Type.STRING);
sortFiled[1] = new SortField("country", new CaseInsensitiveStringComparator());

Use a custom FieldComparatorSource in the SortField instead of a sort field type. In the code above, the country field is sorted in case-insensitive mode. See the custom FieldComparatorSource class below:

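import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.FieldComparator;
import org.apache.lucene.search.FieldComparatorSource;

// Factory that hands Lucene a case-insensitive comparator for the given field.
class CaseInsensitiveStringComparator extends FieldComparatorSource {

    @Override
    public FieldComparator<String> newComparator(String fieldname, int numHits,
            int sortPos, boolean reversed) throws IOException {
        return new CaseIgonreCompare(fieldname, numHits);
    }
}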



class CaseIgonreCompare extends FieldComparator<String>{

private String field;
private String bottom;
private String topValue;
private BinaryDocValues cache;
private String[] values;

public CaseIgonreCompare(String field, int numHits) {
    this.field = field;
    this.values = new String[numHits];
}

@Override
public int compare(int arg0, int arg1) {
    return compareValues(values[arg0], values[arg1]);
}

@Override
public int compareBottom(int arg0) throws IOException {
    return compareValues(bottom, cache.get(arg0).utf8ToString());
}

@Override
public int compareTop(int arg0) throws IOException {
    return compareValues(topValue, cache.get(arg0).utf8ToString());
}

// Compare alphabetically, ignoring case. (Comparing lengths first would
// order results by string length rather than alphabetically.)
public int compareValues(String first, String second) {
    return first.compareToIgnoreCase(second);
}

@Override
public void copy(int arg0, int arg1) throws IOException {
   values[arg0] = cache.get(arg1).utf8ToString();
}

@Override
public void setBottom(int arg0) {
    this.bottom  = values[arg0];
}

@Override
public FieldComparator<String> setNextReader(AtomicReaderContext arg0)
        throws IOException {
    this.cache = FieldCache.DEFAULT.getTerms(arg0.reader(), 
            field  , true);
    return this;
}

@Override
public void setTopValue(String arg0) {
    this.topValue = arg0;
}

@Override
public String value(int arg0) {
    return values[arg0];
}

}
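A quick way to sanity-check a comparison function such as compareValues before wiring it into Lucene is to run it over a small array in plain Java. The sketch below contrasts a pure compareToIgnoreCase ordering with a variant that compares lengths first. The length-first variant groups results by length instead of sorting them alphabetically. The class name, method names, and sample values are illustrative only.

```java
import java.util.Arrays;

public class CompareValuesDemo {
    // Purely alphabetical, ignoring case.
    static int ignoreCase(String a, String b) {
        return a.compareToIgnoreCase(b);
    }

    // Shorter strings first; alphabetical order only breaks ties in length.
    static int lengthFirst(String a, String b) {
        int byLength = a.length() - b.length();
        return byLength != 0 ? byLength : a.compareToIgnoreCase(b);
    }

    static String[] sortIgnoreCase(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, CompareValuesDemo::ignoreCase);
        return copy;
    }

    static String[] sortLengthFirst(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, CompareValuesDemo::lengthFirst);
        return copy;
    }

    public static void main(String[] args) {
        String[] countries = {"India", "usa", "Chile", "Brazil", "china"};
        // [Brazil, Chile, china, India, usa]
        System.out.println(Arrays.toString(sortIgnoreCase(countries)));
        // [usa, Chile, china, India, Brazil]
        System.out.println(Arrays.toString(sortLengthFirst(countries)));
    }
}
```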

I needed to order string fields by the rules of the Icelandic alphabet (aábcdðeé...), so I ported the code to C# and used the StringComparer.InvariantCultureIgnoreCase comparer. It works perfectly.

So, here is a C# port of Birbal Singh's code:

CaseInsensitiveStringComparator.cs

public class CaseInsensitiveStringComparator : FieldComparerSource
{
    public override FieldComparer NewComparer(string fieldname, int numHits, int sortPos, bool reversed)
    {
        return new CaseIgonreCompare(fieldname, numHits);
    }
}

CaseIgonreCompare.cs

public class CaseIgonreCompare : FieldComparer<string>
{
    private string _field;
    private string[] _values;       
    private BinaryDocValues _cache;
    private string _bottom; 
    private string _topValue;

    public CaseIgonreCompare(string field, int numHits)
    {
        _field = field;
        _values = new string[numHits];
    }

    public override IComparable this[int slot] => _values[slot];

    public override int CompareValues(string first, string second)
    {
        // Culture-aware, case-insensitive comparison.
        return StringComparer.InvariantCultureIgnoreCase.Compare(first, second);
    }

    private string GetValue(int doc)
    {
        var bytesRef = new BytesRef();
        _cache.Get(doc, bytesRef);
        return bytesRef.Utf8ToString();
    }

    public override int Compare(int slot1, int slot2)
    {
        // Use the same case-insensitive comparison as CompareBottom and CompareTop.
        return CompareValues(_values[slot1], _values[slot2]);
    }

    public override int CompareBottom(int doc)
    {
        return CompareValues(_bottom, GetValue(doc));
    }

    public override int CompareTop(int doc)
    {
        return CompareValues(_topValue, GetValue(doc));
    }

    public override void Copy(int slot, int doc)
    {
        _values[slot] = GetValue(doc);
    }

    public override void SetBottom(int slot)
    {
        _bottom = _values[slot];
    }

    public override FieldComparer SetNextReader(AtomicReaderContext context)
    {
        _cache = FieldCache.DEFAULT.GetTerms(context.AtomicReader, _field, true);

        return this;
    }

    public override void SetTopValue(object value)
    {
        _topValue = value as string;
    }
}
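For readers staying on the Java side, a rough counterpart to a culture-aware, case-insensitive comparer is java.text.Collator with its strength set to SECONDARY, which ignores case differences. This is only a sketch: the class name is made up, and it assumes the running JDK ships collation data for the requested locale. If it does not, Collator falls back to more general rules, so locale-specific letter ordering is not guaranteed. Such a collator could then take the place of compareToIgnoreCase inside compareValues.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollatorDemo {
    // SECONDARY strength treats "a" and "A" as equal but still
    // distinguishes accented letters from unaccented ones.
    static Collator caseInsensitive(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.SECONDARY);
        return collator;
    }

    static String[] sort(Locale locale, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, caseInsensitive(locale));
        return copy;
    }

    public static void main(String[] args) {
        Locale icelandic = Locale.forLanguageTag("is");
        System.out.println(Arrays.toString(
                sort(icelandic, "banana", "Apple", "apple", "Banana")));
    }
}
```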


 