elasticsearch bulk indexing using python
I am trying to index a CSV file with 6M records into Elasticsearch using the Python pyes module. The code below reads the file record by record and pushes each one to Elasticsearch individually. Any idea how I can send these as a bulk request instead?
import csv
from pyes import *
import sys

header = ['col1', 'col2', 'col3', 'col3', 'col4', 'col5', 'col6']

conn = ES('xx.xx.xx.xx:9200')

# reader setup was omitted from the original post; assumed to be a
# csv.reader over the input file
reader = csv.reader(open(sys.argv[1], 'rb'))

counter = 0
for row in reader:
    if counter == 0:
        pass  # skip the header row
    else:
        colnum = 0
        data = {}
        for j in row:
            data[header[colnum]] = str(j)
            colnum += 1
        print data
        print counter
        conn.index(data, 'accidents-index', 'accidents-type', counter)
    counter += 1
pyelasticsearch supports bulk indexing:
bulk_index(index, doc_type, docs, id_field='id', parent_field='_parent'[, other kwargs listed below])
For example,
cities = []
for line in f:
    fields = line.rstrip().split("\t")
    city = {"id": fields[0], "city": fields[1]}
    cities.append(city)
    if len(cities) == 1000:
        es.bulk_index(es_index, "city", cities, id_field="id")
        cities = []
if len(cities) > 0:
    es.bulk_index(es_index, "city", cities, id_field="id")
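The accumulate-and-flush pattern above can be factored into a reusable generator, which keeps the batching logic separate from the Elasticsearch client and makes it easy to test on its own. This is a pure-Python sketch; the function name and default batch size are illustrative, not part of pyelasticsearch.

    def batched(rows, batch_size=1000):
        """Yield lists of at most batch_size items from an iterable.

        Mirrors the logic in the example above: collect rows into a
        buffer, emit the buffer when it is full, and emit any final
        partial buffer at the end.
        """
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # flush the remainder
            yield batch

Each yielded list can then be handed to a single bulk_index call, e.g. `for docs in batched(reader): es.bulk_index(es_index, "city", docs, id_field="id")`.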