
NEST Elasticsearch Reindex examples

My objective is to reindex an index with 10 million shards in order to change its field mappings and make significant terms analysis possible.

My problem is that I am having trouble using the NEST library to perform a reindex, and the documentation is (very) limited. If possible, I need an example of the following in use:

http://nest.azurewebsites.net/nest/search/scroll.html

http://nest.azurewebsites.net/nest/core/bulk.html

NEST provides a nice Reindex method you can use, although the documentation is lacking. I've used it in a very rough-and-ready fashion with this ad-hoc WinForms code.

    private ElasticClient client;
    private double count;

    private void reindex_Completed()
    {
        MessageBox.Show("Done!");
    }

    private void reindex_Next(IReindexResponse<object> obj)
    {
        count += obj.BulkResponse.Items.Count();
        var progress = 100 * count / (double)obj.SearchResponse.Total;
        progressBar1.Value = (int)progress;
    }

    private void reindex_Error(Exception ex)
    {
        MessageBox.Show(ex.ToString());
    }

    private void button1_Click(object sender, EventArgs e)
    {
        count = 0;

        var reindex = client.Reindex<object>(r => r.FromIndex(fromIndex.Text).NewIndexName(toIndex.Text).Scroll("10s"));

        var o = new ReindexObserver<object>(onError: reindex_Error, onNext: reindex_Next, completed: reindex_Completed);
        reindex.Subscribe(o);
    }
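One caveat with the progress calculation in reindex_Next: if the reported total is zero, or the running count overshoots it, the computed percentage falls outside the progress bar's range and assigning it to progressBar1.Value can throw. A minimal sketch of a guard, assuming a hypothetical helper named ProgressPercent (not part of NEST) and a progress bar left at its default range of 0 to 100:

```csharp
using System;

public static class ReindexProgress
{
    // Hypothetical helper (not part of NEST): converts a running document
    // count into a whole percentage that is always within 0..100.
    public static int ProgressPercent(double done, long total)
    {
        if (total <= 0)
        {
            return 0; // nothing to report yet; also avoids dividing by zero
        }

        var percent = 100.0 * done / total;
        return (int)Math.Max(0.0, Math.Min(100.0, percent));
    }
}
```

In reindex_Next this would replace the two progress lines with progressBar1.Value = ReindexProgress.ProgressPercent(count, obj.SearchResponse.Total);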

And I've just found the blog post that showed me how to do it: http://thomasardal.com/elasticsearch-migrations-with-c-and-nest/

Unfortunately the NEST implementation is not quite what I expected. In my opinion it's a bit over-engineered for what is possibly the most common use case.

A lot of people just want to update their mappings with zero downtime...

In my case - I had already taken care of creating the index with all its settings and mappings, but NEST insists that it must create a new index when reindexing. That among many other things. Too many other things.

I found it much less complicated to just implement it directly, since NEST already has Search, Scroll, and Bulk methods (this is adapted from NEST's implementation):

// Assumes you have already created and set up the new index (settings and mappings) yourself
public void Reindex(ElasticClient client, string aliasName, string currentIndexName, string nextIndexName)
{
    Console.WriteLine("Reindexing documents to new index...");
    var searchResult = client.Search<object>(s => s.Index(currentIndexName).AllTypes().From(0).Size(100).Query(q => q.MatchAll()).SearchType(SearchType.Scan).Scroll("2m"));
    if (searchResult.Total <= 0)
    {
        Console.WriteLine("Existing index has no documents, nothing to reindex.");
    }
    else
    {
        var page = 0;
        IBulkResponse bulkResponse = null;
        do
        {
            var result = searchResult;
            searchResult = client.Scroll<object>(s => s.Scroll("2m").ScrollId(result.ScrollId));
            // Fail fast: a failed scroll returns no documents and would otherwise end the loop silently
            searchResult.ThrowOnError("reindex scroll " + page);
            if (searchResult.Documents != null && searchResult.Documents.Any())
            {
                bulkResponse = client.Bulk(b =>
                {
                    foreach (var hit in searchResult.Hits)
                    {
                        b.Index<object>(bi => bi.Document(hit.Source).Type(hit.Type).Index(nextIndexName).Id(hit.Id));
                    }

                    return b;
                }).ThrowOnError("reindex page " + page);
                Console.WriteLine("Reindexing progress: " + (page + 1) * 100);
            }

            ++page;
        }
        while (searchResult.IsValid && bulkResponse != null && bulkResponse.IsValid && searchResult.Documents != null && searchResult.Documents.Any());
        Console.WriteLine("Reindexing complete!");
    }

    Console.WriteLine("Updating alias to point to new index...");
    client.Alias(a => a
        .Add(aa => aa.Alias(aliasName).Index(nextIndexName))
        .Remove(aa => aa.Alias(aliasName).Index(currentIndexName)));

    // TODO: Don't forget to delete the old index if you want
}

And here is the ThrowOnError extension method, in case you want it:

public static T ThrowOnError<T>(this T response, string actionDescription = null) where T : IResponse
{
    if (!response.IsValid)
    {
        throw new CustomExceptionOfYourChoice(actionDescription == null ? string.Empty : "Failed to " + actionDescription + ": " + response.ServerError.Error);
    }

    return response;
}
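Note that CustomExceptionOfYourChoice is a placeholder, so the snippet will not compile until you supply a real type. A minimal sketch of one possibility, assuming a hypothetical ReindexException name:

```csharp
using System;

// Hypothetical exception type to stand in for CustomExceptionOfYourChoice.
public class ReindexException : Exception
{
    public ReindexException(string message)
        : base(message)
    {
    }

    public ReindexException(string message, Exception innerException)
        : base(message, innerException)
    {
    }
}
```

Also keep in mind that C# only accepts extension methods declared inside a non-nested, non-generic static class, so ThrowOnError has to live in one.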

I second Ben Wilde's answer above. It is better to have full control over index creation and the reindex process.

What's missing from Ben's code is support for parent/child relationships. Here is my code to fix that:

Replace the following lines:

foreach (var hit in searchResult.Hits)
{
    b.Index<object>(bi => bi.Document(hit.Source).Type(hit.Type).Index(nextIndexName).Id(hit.Id));
}

With this:

foreach (var hit in searchResult.Hits)
{
    var jo = hit.Source as JObject;
    JToken jt;
    if(jo != null && jo.TryGetValue("parentId", out jt))
    {
        // Document is child-document => add parent reference
        string parentId = (string)jt;
        b.Index<object>(bi => bi.Document(hit.Source).Type(hit.Type).Index(nextIndexName).Id(hit.Id).Parent(parentId));
    }
    else
    {
        b.Index<object>(bi => bi.Document(hit.Source).Type(hit.Type).Index(nextIndexName).Id(hit.Id));
    }                                
}
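The parent lookup relies on each hit's source being deserialized as a Json.NET JObject and on child documents carrying their parent's id in a "parentId" field. To see that check in isolation, here is a small sketch using only Newtonsoft.Json; the GetParentId helper and the sample documents are made up for illustration:

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class ParentIdDemo
{
    // Hypothetical helper: returns the value of the "parentId" field, or null
    // when the source is not a JObject or has no such field.
    public static string GetParentId(object source)
    {
        var jo = source as JObject;
        JToken jt;
        if (jo != null && jo.TryGetValue("parentId", out jt))
        {
            return (string)jt;
        }

        return null;
    }

    public static void Main()
    {
        // Sample documents invented for this sketch
        var child = JObject.Parse("{ \"parentId\": \"42\", \"text\": \"a comment\" }");
        var parent = JObject.Parse("{ \"title\": \"a post\" }");

        Console.WriteLine(GetParentId(child) ?? "(no parent)");
        Console.WriteLine(GetParentId(parent) ?? "(no parent)");
    }
}
```

If your child documents keep the parent id under a different field name, or not in the source at all, the lookup has to be adjusted accordingly.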
