
Bulk Update with Elasticsearch in Python

I was successfully pushing my docs to Elasticsearch and had already checked them in Kibana.

My current code looks like this:

from elasticsearch import helpers  # es is an already-connected Elasticsearch client

try:
    res = helpers.bulk(es, my_function(df))
    print("Working")
except Exception as e:
    print(e)

and this is the code for my_function:

def my_function(df):
    # df is expected to be an iterable of dicts, one per document
    for line in df:
        yield {
            '_index': 'my_index',
            # '_type' is deprecated in ES 7 and removed in ES 8, so it is omitted
            '_id': line.get("_id"),
            '_source': {
                'field_A': line.get('field', "")
            }
        }
    # No explicit StopIteration: since PEP 479 (Python 3.7), raising it
    # inside a generator becomes a RuntimeError; just let the loop end.

Now I wonder: what happens if I run the Python script again in the future to push just some new docs to Elasticsearch? Does anyone have an idea of how to do this?

It only depends on the _id you're sending: if it's the same, there won't be any duplicate; the new version will simply override the old one.

So there's not much to worry about: new documents that don't exist yet will be indexed, and updates to existing documents will override their older versions, provided they are sent with the same _id.
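As a sketch of that idea (assuming a hypothetical list of dicts called records, each carrying its own _id), keeping the _id deterministic makes re-runs idempotent: the same document sent twice overwrites itself rather than duplicating. And if you want to index only genuinely new documents, the bulk helper accepts an _op_type of "create", which makes Elasticsearch reject any action whose _id already exists (a 409 conflict) instead of overwriting it:

```python
# Sketch: building bulk actions with stable _ids so re-runs upsert
# rather than duplicate. `records` is a hypothetical example input;
# a live Elasticsearch client would be needed to actually send these.

def generate_actions(records, index="my_index", create_only=False):
    """Yield bulk actions; with create_only=True, existing _ids are rejected."""
    for line in records:
        action = {
            "_index": index,
            "_id": line["_id"],  # same _id => overwrite, never a duplicate
            "_source": {"field_A": line.get("field", "")},
        }
        if create_only:
            # "create" makes ES return a 409 for docs whose _id already
            # exists, so only new documents get indexed on a re-run.
            action["_op_type"] = "create"
        yield action

records = [{"_id": "1", "field": "a"}, {"_id": "2", "field": "b"}]
actions = list(generate_actions(records, create_only=True))
```

With a real cluster you would pass the generator to helpers.bulk(es, generate_actions(records, create_only=True), raise_on_error=False) and inspect the returned error list for the conflict entries of documents that already existed.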
