
How can I send multiple documents to an Elasticsearch data stream using Python?

I am trying to index a large number of documents into Elasticsearch using Python. After reading the documentation, it refers to this example.

This example works fine when I am indexing into a normal index. However, when I try to index into a data stream, even into a brand-new data stream that accepts dynamic content, I get this error:

Traceback (most recent call last):
  File "/Users/Downloads/elasticsearch-py-main/examples/bulk-ingest/bulk-ingest.py", line 111, in <module>
    main()
  File "/Users/Downloads/elasticsearch-py-main/examples/bulk-ingest/bulk-ingest.py", line 102, in main
    for ok, action in bulk(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 524, in bulk
    for ok, item in streaming_bulk(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 438, in streaming_bulk
    for data, (ok, info) in zip(
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 355, in _process_bulk_chunk
    yield from gen
  File "/opt/homebrew/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 274, in _process_bulk_chunk_success
    raise BulkIndexError(f"{len(errors)} document(s) failed to index.", errors)
elasticsearch.helpers.BulkIndexError: 2 document(s) failed to index.

I cannot find any information on this. How can I bulk index my data into a data stream using the Elasticsearch Python client?

This is probably because, when sending documents to a data stream, you need to set the action to create instead of index:

{ "create": {"_id": "123"}}
{ "field": "value" }

With the Python bulk helpers, you need to explicitly set '_op_type': 'create' in your bulk actions.
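
Here is a minimal sketch of what that looks like with the bulk helper, assuming a local cluster at http://localhost:9200 and a hypothetical data stream named my-data-stream. Note that data streams also require a matching index template to exist and each document to carry an @timestamp field.

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

client = Elasticsearch("http://localhost:9200")  # assumed endpoint

def generate_actions(docs):
    for doc in docs:
        yield {
            "_op_type": "create",        # data streams reject the default "index" op
            "_index": "my-data-stream",  # hypothetical data stream name
            "_source": doc,
        }

docs = [{"@timestamp": "2024-01-01T00:00:00Z", "field": "value"}]
successes, errors = bulk(client, generate_actions(docs))
print(f"Indexed {successes} document(s), {len(errors)} error(s)")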
