
Elasticsearch Java API fuzzy search test

I have a problem with the native Elasticsearch Java API. I wanted to create a method that searches for an object by its name attribute. That part was easy, but when I wrote a JUnit test for the method, the problems started.

@Test
public void nameSearchTest() throws ElasticSearchUnavailableException, IOException{
    String nameToSearch = "fuzzyText";
    TrainingToCreate t = new TrainingToCreate();
    t.setName(nameToSearch);
    //Create two Trainings to find sth
    String id1 = ElasticIndexer.index(t);
    String id2 = ElasticIndexer.index(t);
    //For creating delay, throws Exception if id doesn't exist
    ElasticGetter.getTrainingById(id1);
    ElasticGetter.getTrainingById(id2);

    int hits = 0;
    ArrayList<Training> trainings = ElasticSearch.fuzzySearchTrainingByName(nameToSearch, Integer.MAX_VALUE, 0);
    System.out.println("First id: " + id1);
    System.out.println("Second id: " + id2);
    String idOfTraining;
    if(trainings.size() == 0){
        System.out.println("Zero hits could be found.");
    }
    //just for printing id's of results
    //-------------------------------------------------
    for (int i = 0; i < trainings.size(); i++) {
        idOfTraining = trainings.get(i).getId();
        System.out.println("Training: "+i+" id: "+ idOfTraining);
    }
    //-------------------------------------------------
    for (Training training : trainings) {
        if(training.getId().equals(id1)||training.getId().equals(id2)){
            hits++;
        }
    }
    assertTrue(hits>=2);
    ElasticDelete.deleteTrainingById(id1);
    ElasticDelete.deleteTrainingById(id2);
}

Sometimes this test passes without a problem; other times the search returns nothing, even though I have just created documents to make sure something can be found. When I look in the Elasticsearch database, the documents exist, so I guess either my implementation is wrong or the search API has a serious delay.

Here is the code that's being tested:

public static ArrayList<Training> fuzzySearchTrainingByName(String name, int size, int offset) throws ElasticSearchUnavailableException, JsonParseException, JsonMappingException, IOException {
    Client client = clientFactory.getClient(configService.getConfig().getElasticSearchIp(), configService
            .getConfig().getElasticSearchPort());
    return ElasticSearch.fuzzySearchDocument(client, "trainings", "training", "name", name, size, offset);
}

private static ArrayList<Training> fuzzySearchDocument(Client client, String index, String type, String field, String value, int size, int offset) throws JsonParseException, JsonMappingException, IOException {
    QueryBuilder query = fuzzyQuery(field, value);

    SearchResponse response = client.prepareSearch(index).setTypes(type)
            .setQuery(query).setSize(size).setFrom(offset).execute().actionGet();

    SearchHits hits = response.getHits();

    TrainingToCreate source = null;
    ObjectMapper mapper = new ObjectMapper();
    ArrayList<Training> trainings = new ArrayList<Training>();

    for (SearchHit searchHit : hits) {
        source = mapper.readValue(searchHit.getSourceAsString(), TrainingToCreate.class);
        trainings.add(TrainingFactory.getTraining(searchHit.getId(), source));
    }
    return trainings;

}

I am working with Java 8 and Elasticsearch 1.7.0. Does anyone recognize where the problem is? If anyone needs further information, please feel free to ask.

Elasticsearch is near real time, which means there is some latency (up to the refresh interval, 1s by default) between the moment you index a document and the moment it becomes searchable. You can overcome this by simply refreshing your indices before running your query.

So I would do it either just after you indexed your sample documents...

public void nameSearchTest() throws ElasticSearchUnavailableException, IOException{
    String nameToSearch = "fuzzyText";
    TrainingToCreate t = new TrainingToCreate();
    t.setName(nameToSearch);
    //Create two Trainings to find sth
    String id1 = ElasticIndexer.index(t);
    String id2 = ElasticIndexer.index(t);

    // REFRESH YOUR INDICES (just after indexing)
    // client() stands for however your test obtains the Client, e.g. via clientFactory.getClient(...)
    client().admin().indices().prepareRefresh().execute().actionGet();

... or just at the very beginning of fuzzySearchDocument

private static ArrayList<Training> fuzzySearchDocument(Client client, String index, String type, String field, String value, int size, int offset) throws JsonParseException, JsonMappingException, IOException {
    // REFRESH YOUR INDICES (just before searching), using the Client passed in as a parameter
    client.admin().indices().prepareRefresh().execute().actionGet();

    QueryBuilder query = fuzzyQuery(field, value);
    ...

If you run several test cases against the same sample documents, I would go with the first option; otherwise either option will do.
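
To make the timing issue more concrete, here is a toy model of near-real-time visibility in plain Java. It is not Elasticsearch code and does not reflect how Elasticsearch is implemented; the class `NearRealTimeIndexSketch` and all of its methods are made up for illustration. It only mimics two behaviours: a get-by-id sees a document immediately (the Elasticsearch get API is real time by default, which is why the `getTrainingById` calls in the question succeed), while a search only sees documents after a refresh.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: NOT Elasticsearch code, just an illustration of refresh semantics. */
final class NearRealTimeIndexSketch {
    // Documents that were indexed but not yet refreshed (id -> name).
    private final Map<String, String> buffer = new LinkedHashMap<>();
    // Documents that the last refresh made visible to search (id -> name).
    private final Map<String, String> searchable = new LinkedHashMap<>();

    /** Indexing only writes to the buffer; the document is not searchable yet. */
    void index(String id, String name) {
        buffer.put(id, name);
    }

    /** Get-by-id is "real time": it also sees documents still sitting in the buffer. */
    String get(String id) {
        String pending = buffer.get(id);
        return pending != null ? pending : searchable.get(id);
    }

    /** Search only looks at refreshed documents and returns the matching ids. */
    List<String> searchByName(String name) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : searchable.entrySet()) {
            if (entry.getValue().equals(name)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    /** Refresh moves everything from the buffer into the searchable view. */
    void refresh() {
        searchable.putAll(buffer);
        buffer.clear();
    }

    public static void main(String[] args) {
        NearRealTimeIndexSketch index = new NearRealTimeIndexSketch();
        index.index("id1", "fuzzyText");
        index.index("id2", "fuzzyText");

        System.out.println(index.get("id1"));                       // fuzzyText
        System.out.println(index.searchByName("fuzzyText").size()); // 0
        index.refresh();
        System.out.println(index.searchByName("fuzzyText").size()); // 2
    }
}
```

In this model the test from the question corresponds to calling `searchByName` right after `index`: whether it finds anything depends on whether a refresh happened in between. In real Elasticsearch that refresh is periodic, which would explain why the test only fails some of the time.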

