How to erase ElasticSearch index?

My unit/integration tests include tests for search functionality.

My idea is to have an empty search index before each test. So I'm trying to remove all documents from the index in the setup method (this is Groovy code):

Client client = searchConnection.client

SearchResponse response = client.prepareSearch("item")
    .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
    .setQuery(termQuery('name', 'test')) //tried also matchAllQuery()
    .setFrom(0).setSize(100).setExplain(false).execute().actionGet()

List<String> ids = response.hits.hits.collect {
    return it.id
}
client.close()

client = searchConnection.client

ids.each {
    DeleteResponse delete = client.prepareDelete("item", "item", it)
        .setOperationThreaded(false)
        .execute().actionGet()
}

client.close()

It seems that the deletions are processed asynchronously, so I added Thread.sleep(5000) after them. As you can see, I also tried closing and reopening the connection a few times, but that doesn't help.

The problem is that sometimes the deletion needs more than 5 seconds, sometimes a test can't find data that was just added (from a previous test), and so on. Most annoying of all, the integration tests become unstable. Putting Thread.sleep() everywhere doesn't look like a good solution.

Is there any way to commit the last changes, or to hold a lock until all data has been written?

Found solution:

IndicesAdminClient adminClient = searchConnection.client.admin().indices();
String indexName = "location";
DeleteIndexResponse delete = adminClient.delete(new DeleteIndexRequest(indexName)).actionGet()
if (!delete.isAcknowledged()) {
    log.error("Index {} wasn't deleted", indexName);
}

and

client.admin().indices().flush(new FlushRequest('location')).actionGet();

after putting new data into the index.

First of all, you don't have to clear the data by issuing a delete for each document id. You can delete everything with a delete-by-query that matches all documents: http://www.elasticsearch.org/guide/reference/api/delete-by-query.html That said, I don't recommend that either, because running it frequently on large document collections is discouraged (see the docs).

What you really want to do is delete the whole index (it's fast): http://www.elasticsearch.org/guide/reference/api/admin-indices-delete-index.html , then recreate it, put in the data and, importantly, refresh the index to "commit" the changes and make them visible: http://www.elasticsearch.org/guide/reference/api/admin-indices-refresh.html

I do this in my tests and never had a problem.
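Why the refresh step matters can be illustrated with a toy model. The class below is not Elasticsearch and uses none of its API; `ToyIndex` and its methods are hypothetical names invented for this sketch. It only mimics one rule: a document that has been indexed is not returned by search until the next refresh. Since Elasticsearch by default refreshes an index only about once per second, a test that searches right after indexing may or may not see its data unless it refreshes explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model, NOT Elasticsearch: it only mimics the rule that
// newly indexed documents become searchable after a refresh.
final class ToyIndex {
    // Documents that were indexed but are not yet visible to search.
    private final List<String> buffered = new ArrayList<>();
    // Documents that search can see.
    private final List<String> searchable = new ArrayList<>();

    // Accepts a document; it stays invisible until refresh() is called.
    void index(String doc) {
        buffered.add(doc);
    }

    // Makes every buffered document visible to search.
    void refresh() {
        searchable.addAll(buffered);
        buffered.clear();
    }

    // Number of documents a match-all search would return right now.
    int searchCount() {
        return searchable.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.index("doc-1");
        System.out.println("before refresh: " + index.searchCount());
        index.refresh();
        System.out.println("after refresh: " + index.searchCount());
    }
}
```

In a real test the same order applies: delete the index, recreate it, index the fixture data, refresh, and only then run the searches.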

  1. The call is not asynchronous: actionGet() blocks until the response arrives. (To make it asynchronous, pass a listener to execute() and skip actionGet().)
  2. Delete all items via:

     client.prepareDeleteByQuery(indexName)
         .setQuery(QueryBuilders.matchAllQuery())
         .setTypes(indexType)
         .execute().actionGet();
  3. Refresh your index to see the changes (only required in unit tests).

My idea is to have an empty search index before each test

So create a new index at the start of each test instead of reusing the old one. Then you're guaranteed an empty one. In the test's teardown, you can delete the test index.
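One way to get a fresh index per test is to generate a unique index name in the setup method and delete that index in teardown. The helper below is a minimal sketch; `TestIndexNames` and `unique` are hypothetical names, not part of the Elasticsearch client. It lowercases the prefix because Elasticsearch index names must be lowercase.

```java
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper for tests; not part of the Elasticsearch API.
final class TestIndexNames {
    private TestIndexNames() {}

    // Returns the lowercased prefix followed by "-" and a random UUID,
    // so every call yields a different index name.
    static String unique(String prefix) {
        String safePrefix = prefix.toLowerCase(Locale.ROOT);
        return safePrefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Prints a different name on every run.
        System.out.println(unique("item-test"));
    }
}
```

The generated name would be stored in a field of the test class, used for every index and search call in that test, and passed to the delete-index request in teardown.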

Is there any way to commit the last changes, or to hold a lock until all data has been written?

No, ElasticSearch has no transactions or locking.

If you don't want to create a new index each time, try adding a loop that checks whether the index is empty and, if it is not, waits and checks again until it is.
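Such a loop should have an upper time limit so that a test fails instead of hanging. The following is a rough sketch under stated assumptions: `IndexPolling` and `waitUntilEmpty` are hypothetical names, and the document count comes from a caller-supplied `LongSupplier`, which in a real test would wrap a count request against the client (that wiring is not shown here).

```java
import java.util.function.LongSupplier;

// Hypothetical polling helper; it knows nothing about Elasticsearch itself.
final class IndexPolling {
    private IndexPolling() {}

    // Polls docCount until it reports 0 or the timeout elapses.
    // Returns true if the index became empty, false on timeout or interrupt.
    static boolean waitUntilEmpty(LongSupplier docCount, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (docCount.getAsLong() == 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A setup method could call `waitUntilEmpty(countSupplier, 5000, 50)` after deleting the documents and fail the test when it returns false, which replaces the fixed Thread.sleep(5000) with a wait that ends as soon as the index is actually empty.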
