I need to insert about 1.5 million documents into an Elasticsearch database. I do it via the PHP library Elastica, following this example (BULK example).
I would like to know whether it is possible to call $elasticaType->getIndex()->refresh(); once at the very end of the bulk insertion, and whether that is safe and faster than calling $elasticaType->getIndex()->refresh(); after every bulk request. I mean something like this:
$offset = 0;
$limit = 500;
$sum = 1500000;
while ( $offset < $sum )
{
    $documents = [];
    // Fetch the next batch of rows from the SQL database.
    $rows = $sqlDatabase->getData( $offset, $limit );
    foreach ( $rows as $row )
    {
        $docData = ['name' => $row->name, 'email' => $row->email];
        $documents[] = new \Elastica\Document( $row->id, $docData );
    }
    // Send the whole batch as one bulk request.
    $elasticaType->addDocuments( $documents );
    $offset += $limit;
    // The source example has a refresh here, after every 500 items. But I want it at the very end of the code, after all 1500000 items are in the database.
    // $elasticaType->getIndex()->refresh();
}
$elasticaType->getIndex()->refresh(); // This is what I want.
Is it possible to insert 1500000 documents into Elasticsearch and then call $elasticaType->getIndex()->refresh();?
Is it possible to insert 1500000 documents into Elasticsearch and then call $elasticaType->getIndex()->refresh();?
Definitely yes.
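A refresh makes your documents available for search. This mechanism is derived from Apache Lucene to provide near real-time (NRT) search capabilities; it uses DirectoryReader.openIfChanged to reopen the index. Skipping the refresh after each bulk request does not lose any data: the documents are still indexed, they just do not show up in search results until the next refresh.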
Usually you don't have to do it yourself: a refresh is scheduled periodically by default. You can set refresh_interval to a shorter time for NRT search, or to a longer one for better indexing performance.
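For a one-off bulk load like the one in the question, one common approach is to switch the periodic refresh off while inserting and restore it afterwards. Below is a minimal sketch of the two settings requests, assuming they are sent to the index's _settings endpoint. The index name my_index is a placeholder, and 1s is the Elasticsearch default interval; use whatever value your index had before.

```
# Before the bulk load: disable periodic refresh (my_index is a placeholder name)
PUT /my_index/_settings
{ "index": { "refresh_interval": "-1" } }

# After the bulk load: restore the interval (1s is the default)
PUT /my_index/_settings
{ "index": { "refresh_interval": "1s" } }
```

With the interval restored, a single explicit refresh at the end of the script makes all inserted documents searchable right away instead of waiting for the next scheduled refresh.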