ElasticSearch Nest 2.x Indexing and Searching Nested Objects

I'm having trouble figuring out how to index and search nested objects.

I want to be able to search nested objects and return the parents: only the parents, without the list of Remarks, but with highlights from the matching remarks if possible.

My models:

[DataContract]
[ElasticsearchType(IdProperty = "CustomerId", Name = "CustomerSearchResult")]
public class SearchResult
{
    [DataMember]
    [String(Index = FieldIndexOption.NotAnalyzed)]
    public int CustomerId { get; set; }
    ...

    [Nested]
    [DataMember]
    public List<RemarkForSearch> Remarks { get; set; }
}

[ElasticsearchType(IdProperty = "RemarkId", Name = "RemarkForSearch")]
public class RemarkForSearch
{
    [DataMember]
    public int RemarkId { get; set; }

    [DataMember]
    public int CustomerId { get; set; }

    [DataMember]
    public string RemarkText { get; set; }
}

Index creation:

var customerSearchIdxDesc = new CreateIndexDescriptor(Constants.ElasticSearch.CustomerSearchIndexName)
    .Settings(f =>
        f.Analysis(analysis => analysis
                .CharFilters(cf => cf
                    .PatternReplace(Constants.ElasticSearch.FilterNames.RemoveNonAlphaNumeric, pr => pr
                        .Pattern(@"[^a-zA-Z\d]") // match all non alpha numeric
                        .Replacement(string.Empty)
                    )
                )
                .TokenFilters(tf => tf
                    .NGram(Constants.ElasticSearch.FilterNames.NGramFilter, fs => fs
                        .MinGram(1)
                        .MaxGram(20)
                    )
                )
                .Analyzers(analyzers => analyzers
                    .Custom(Constants.ElasticSearch.AnalyzerNames.NGramAnalyzer, a => a
                        .Filters("lowercase", "asciifolding", Constants.ElasticSearch.FilterNames.NGramFilter)
                        .Tokenizer(Constants.ElasticSearch.TokenizerNames.WhitespaceTokenizer)
                    )
                    .Custom(Constants.ElasticSearch.AnalyzerNames.WhitespaceAnalyzer, a => a
                        .Filters("lowercase", "asciifolding")
                        .Tokenizer(Constants.ElasticSearch.TokenizerNames.WhitespaceTokenizer)
                    )
                    .Custom(Constants.ElasticSearch.AnalyzerNames.FuzzyAnalyzer, a => a
                        .Filters("lowercase", "asciifolding")
                        //.CharFilters(Constants.ElasticSearch.FilterNames.RemoveNonAlphaNumeric)
                        .Tokenizer(Constants.ElasticSearch.TokenizerNames.NGramTokenizer)
                    )
                )
                .Tokenizers(tokenizers => tokenizers
                    .NGram(Constants.ElasticSearch.TokenizerNames.NGramTokenizer, t => t
                        .MinGram(1)
                        .MaxGram(20)
                        //.TokenChars(TokenChar.Letter, TokenChar.Digit)
                    )

                    .Whitespace(Constants.ElasticSearch.TokenizerNames.WhitespaceTokenizer)
                )
        )
    )
    .Mappings(ms => ms
        .Map<ServiceModel.DtoTypes.Customer.SearchResult>(m => m
            .AutoMap()
            .AllField(s => s
                .Analyzer(Constants.ElasticSearch.AnalyzerNames.NGramAnalyzer)
                .SearchAnalyzer(Constants.ElasticSearch.AnalyzerNames.WhitespaceAnalyzer)
            )
            .Properties(p => p
                .String(n => n
                    .Name(c => c.ContactName)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(c => c.CustomerName)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(c => c.City)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(c => c.StateAbbreviation)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(c => c.PostalCode)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(c => c.Country)
                    .Index(FieldIndexOption.NotAnalyzed)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName)) 
                )
                .Number(n => n
                    .Name(c => c.AverageMonthlySales)
                    .Type(NumberType.Double)
                    .CopyTo(fs => fs.Field(Constants.ElasticSearch.CombinedSearchFieldName))
                )
                .String(n => n
                    .Name(Constants.ElasticSearch.CombinedSearchFieldName)
                    .Index(FieldIndexOption.Analyzed)
                    .Analyzer(Constants.ElasticSearch.AnalyzerNames.FuzzyAnalyzer)
                    .SearchAnalyzer(Constants.ElasticSearch.AnalyzerNames.FuzzyAnalyzer)
                )
                .Nested<ServiceModel.DtoTypes.Customer.RemarkForSearch>(s => s
                    .Name(n => n.Remarks)
                    .AutoMap()
                )
            )
        )
    );


var response = client.CreateIndex(customerSearchIdxDesc);

Loading the index:

var searchResults = Db.SqlList<DtoTypes.Customer.SearchResult>("EXEC [Customer].[RetrieveAllForSearch]");
var remarkResults = Db.SqlList<DtoTypes.Customer.RemarkForSearch>("EXEC [Customer].[RetrieveAllSearchableRemarks]");

// Attach each customer's remarks to its parent document
foreach (var i in searchResults)
{
    i.Remarks = remarkResults.Where(m => m.CustomerId == i.CustomerId).ToList();
}

var settings = new ConnectionSettings(Constants.ElasticSearch.Node);
var client = new ElasticClient(settings);

// Flush the index (this commits pending operations; it does not delete existing documents)
var flushResponse = client.Flush(Constants.ElasticSearch.CustomerSearchIndexName);

// Bulk index the customers together with their nested remarks
var indexResponse = client.IndexMany(searchResults, Constants.ElasticSearch.CustomerSearchIndexName);

Querying the index:

var searchDescriptor = new SearchDescriptor<DtoTypes.Customer.SearchResult>()
    .From(0)
    .Take(Constants.ElasticSearch.MaxResults)
    .Query(q => q
        .Nested(c => c
            .Path(p => p.Remarks)
            .Query(nq => nq
                .Match(m => m
                    .Query(query)
                    .Field("remarks.remarktext")
                )
            )
        )
    );

response = client.Search<DtoTypes.Customer.SearchResult>(searchDescriptor);

I don't know if I'm bulk loading the index properly, or whether it's smart enough to know that the Remarks property is a nested property and to load those as well.

The search returns no errors, but I get no results.

The search query generates this JSON, which from what I can tell is OK:

{
  "from": 0,
  "size": 100,
  "query": {
    "nested": {
      "query": {
        "match": {
          "remarks.remarktext": {
            "query": "test"
          }
        }
      },
      "path": "remarks"
    }
  }
}

I do see the remark data when I look at the JSON returned by http://127.0.0.1:9200/customersearch/_search
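
One thing that may be worth checking here is the mapping that was actually created, for example via http://127.0.0.1:9200/customersearch/_mapping. The fragment below is only a sketch of what I would expect the remarks part to look like; it assumes NEST's default behaviour of camel-casing property names and Elasticsearch 2.x type names, neither of which is confirmed by the question.

```json
{
  "remarks": {
    "type": "nested",
    "properties": {
      "remarkId":   { "type": "integer" },
      "customerId": { "type": "integer" },
      "remarkText": { "type": "string" }
    }
  }
}
```

If the field really is stored as remarkText, then the "remarks.remarktext" used in the match query would not line up with it, because field names are case-sensitive. That could be one possible reason for the empty result, though it is only a guess without seeing the real mapping.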

"I want to be able to search nested objects and return the parents - only the parents, without the list of Remarks, but I would like highlights from the remarks returned if possible."

What about this idea: exclude the nested object from the source, but leave the highlight on the nested field in place. Here is what I mean.

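public class Document
{
    public int Id { get; set; }

    [Nested]
    public Nested Nested { get; set; }
}

// Nested type implied by the sample data below
public class Nested
{
    public string Name { get; set; }
}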


var createIndexResponse = client.CreateIndex(indexName, descriptor => descriptor
    .Mappings(map => map
        .Map<Document>(m => m
            .AutoMap()
        )));


var items = new List<Document>
{
    new Document
    {
        Id = 1,  
        Nested = new Nested {Name = "Robert" }
    },
    new Document
    {
        Id = 2, 
        Nested = new Nested {Name = "Someone" }
    }
};

var bulkResponse = client.IndexMany(items);

client.Refresh(indexName);


var searchResponse = client.Search<Document>(s => s
    // Leave the nested object out of the returned _source
    .Source(so => so.Exclude(e => e.Field(f => f.Nested)))
    // Still highlight matches on the nested field
    .Highlight(h => h.Fields(f => f.Field("nested.name")).PreTags("<b>").PostTags("</b>"))
    .Query(q => q
        .Nested(n => n
            .Path(p => p.Nested)
            .Query(nq => nq.Match(m => m
                .Query("Robert").Field("nested.name"))))));

And what Elasticsearch returns is:

{
    "took" : 3,
    "timed_out" : false,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {
        "total" : 1,
        "max_score" : 1.0,
        "hits" : [{
                "_index" : "my_index",
                "_type" : "document",
                "_id" : "1",
                "_score" : 1.0,
                "_source" : {
                    "id" : 1
                },
                "highlight" : {
                    "nested.name" : ["<b>Robert</b>"]
                }
            }
        ]
    }
}
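
For comparison with the JSON in the question, the request body for the search above should look roughly like the following. This is a hand-written sketch for Elasticsearch 2.x, not captured output; it assumes the default camel-cased field names, and the post tag is written here as a closing </b>.

```json
{
  "_source": { "exclude": ["nested"] },
  "highlight": {
    "pre_tags": ["<b>"],
    "post_tags": ["</b>"],
    "fields": { "nested.name": {} }
  },
  "query": {
    "nested": {
      "path": "nested",
      "query": {
        "match": {
          "nested.name": { "query": "Robert" }
        }
      }
    }
  }
}
```

The two parts doing the work are the "_source" exclusion, which keeps the nested object out of each hit, and the top-level "highlight" on the nested field, which still reports what matched.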

What do you think?
