Bulk indexing data in Elasticsearch with sequential IDs

Question

I am using this code to bulk index all of my data into Elasticsearch with Python:

from elasticsearch import Elasticsearch, helpers
import json
import os
import sys

es = Elasticsearch()

def load_json(directory):
    for filename in os.listdir(directory):
        if filename.endswith('.json'):
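            # Join with the directory so files outside the working directory open correctly
            with open(os.path.join(directory, filename), 'r') as open_file: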
                yield json.load(open_file)

helpers.bulk(es, load_json(sys.argv[1]), index='v1_resume', doc_type='candidate')

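I know that if no ID is specified, ES assigns a 20-character ID on its own, but I want the IDs to start at 1 and run up to the number of documents indexed.

How can I achieve this?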

Answer 1

In Elasticsearch, if you do not choose an ID for your document, one is created for you automatically. See the Elastic documentation:

Autogenerated IDs are 20 character long, URL-safe, Base64-encoded GUID 
strings. These GUIDs are generated from a modified FlakeID scheme which 
allows multiple nodes to be generating unique IDs in parallel with 
essentially zero chance of collision.

If you want custom IDs, you need to build them yourself, using a syntax like this:

[
    {'_id': 1,
     '_index': 'index-name',
     '_type': 'document',
     '_source': {
          "title": "Hello World!",
          "body": "..."}

    },
    {'_id': 2,
     '_index': 'index-name',
     '_type': 'document',
     '_source': {
          "title": "Hello World!",
          "body": "..."}
    }
]

helpers.bulk(es, load_json(sys.argv[1]))

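Because the type and index are already specified in each action dictionary, you do not have to pass them to helpers.bulk(). You need to change the output of load_json so that it produces dictionaries like the ones above, one for each document to be saved in ES (see the Python Elasticsearch client docs).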
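One way to make that change is sketched below. It is only an illustration, not the only approach: the function name generate_actions is hypothetical, the index and type names are taken from the question, and the file names are sorted on the assumption that a reproducible numbering is wanted (os.listdir() returns entries in arbitrary order). The `_type` key matches the older Elasticsearch versions used in this thread; on versions that have removed mapping types it would be dropped.

```python
import json
import os


def generate_actions(directory, index="v1_resume", doc_type="candidate"):
    """Yield one bulk action per .json file, with _id counting up from 1."""
    # Sort the names so the same file gets the same ID on every run.
    json_files = sorted(f for f in os.listdir(directory) if f.endswith(".json"))
    for doc_id, filename in enumerate(json_files, start=1):
        with open(os.path.join(directory, filename), "r") as open_file:
            yield {
                "_id": doc_id,          # sequential ID: 1, 2, 3, ...
                "_index": index,
                "_type": doc_type,      # assumption: a cluster that still uses types
                "_source": json.load(open_file),
            }
```

The generator can then be passed straight to the helper, for example helpers.bulk(es, generate_actions(sys.argv[1])). Note that indexing a document under an ID that already exists overwrites the earlier document, so re-running the script replaces documents 1 to N rather than appending new ones.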

Answer 1
0 2017-05-16 12:26:04
