
Autocomplete using Hibernate Search

I am trying to build a better autocomplete feature for my website. I want to use Hibernate Search for this, but in my experiments so far it only matches full words.

So, my question: is it possible to search on just a few characters?

E.g. the user types 3 letters, and Hibernate Search shows all the words from my DB objects that contain those 3 letters.

PS. Right now I am using a "like" query for this, but my DB has grown a lot and I also want to extend the search functionality to other tables.

Major edit: One year on, I was able to improve on the original code I posted to produce this:

My indexed entity:

@Entity
@Indexed
@AnalyzerDef(name = "myanalyzer",
// Split input into tokens according to tokenizer
tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), //
filters = { //
// Normalize token text to lowercase, as the user is unlikely to care about casing when searching for matches
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
// Index partial words (n-grams of each token), so we can provide Autocomplete functionality
@TokenFilterDef(factory = NGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "1024") }),
// Close filters & Analyzerdef
})
@Analyzer(definition = "myanalyzer")
public class Compound extends DomainObject {
public static String[] getSearchFields(){...}
...
}
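
To make it easier to picture what this analyzer chain puts into the index, here is a minimal plain-Java sketch that mimics its three steps (whitespace split, lowercasing, n-gram expansion) without using Lucene at all. The class and method names are hypothetical, and the minimum gram size of 1 is an assumption, since the annotation above only sets maxGramSize; the real filter factories may differ in details such as token order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: mimics the analyzer chain with plain Java, not Lucene.
class NGramSketch {

    // Returns every substring of each whitespace-separated, lowercased token
    // whose length lies between minGram and maxGram (inclusive).
    static List<String> analyze(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<String>();
        for (String token : text.trim().split("\\s+")) {
            String lower = token.toLowerCase();
            for (int start = 0; start < lower.length(); start++) {
                for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                    grams.add(lower.substring(start, start + len));
                }
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [t, te, tea, e, ea, a]
        System.out.println(analyze("Tea", 1, 1024));
    }
}
```

Because every substring becomes its own term, a short input such as "spi" can match "aspirin" as an exact term. The cost is index size: a token of n characters produces n(n+1)/2 terms under these assumptions.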

All @Fields are tokenized and stored in the index, which is required for this to work:
@Field(index = Index.TOKENIZED, store = Store.YES)

@Transactional(readOnly = true)
public synchronized List<String> getSuggestions(final String searchTerm) {
    // Compose query for term over all fields in Compound
    String lowerCasedSearchTerm = searchTerm.toLowerCase();

    // Create a fullTextSession for the sessionFactory.getCurrentSession()
    FullTextSession fullTextSession = Search.getFullTextSession(getSession());

    // New DSL based query composition
    SearchFactory searchFactory = fullTextSession.getSearchFactory();
    QueryBuilder buildQuery = searchFactory.buildQueryBuilder().forEntity(Compound.class).get();
    TermContext keyword = buildQuery.keyword();
    WildcardContext wildcard = keyword.wildcard();
    String[] searchfields = Compound.getSearchFields();
    TermMatchingContext onFields = wildcard.onField(searchfields[0]);
    for (int i = 1; i < searchfields.length; i++)
        onFields = onFields.andField(searchfields[i]);
    // Wildcard terms are not analyzed, so pass the already lowercased input
    TermTermination matching = onFields.matching(lowerCasedSearchTerm);
    Query query = matching.createQuery();

    // Convert the Search Query into something that provides results: Specify Compound again to be future proof
    FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query, Compound.class);
    fullTextQuery.setMaxResults(20);

    // Projection does not work on collections or maps which are indexed via @IndexedEmbedded
    List<String> projectedFields = new ArrayList<String>();
    projectedFields.add(ProjectionConstants.DOCUMENT);
    List<String> embeddedFields = new ArrayList<String>();
    for (String fieldName : searchfields)
        if (fieldName.contains("."))
            embeddedFields.add(fieldName);
        else
            projectedFields.add(fieldName);

    @SuppressWarnings("unchecked")
    List<Object[]> results = fullTextQuery.setProjection(projectedFields.toArray(new String[projectedFields.size()])).list();

    // Keep a list of suggestions retrieved by search over all fields
    List<String> suggestions = new ArrayList<String>();
    for (Object[] projectedObjects : results) {
        // Retrieve the search suggestions for the simple projected field values
        for (int i = 1; i < projectedObjects.length; i++) {
            String fieldValue = projectedObjects[i].toString();
            if (fieldValue.toLowerCase().contains(lowerCasedSearchTerm))
                suggestions.add(fieldValue);
        }

        // Extract the search suggestions for the embedded fields from the document
        Document document = (Document) projectedObjects[0];
        for (String fieldName : embeddedFields)
            for (Field field : document.getFields(fieldName))
                if (field.stringValue().toLowerCase().contains(lowerCasedSearchTerm))
                    suggestions.add(field.stringValue());
    }

    // Return the composed list of suggestions, which might be empty
    return suggestions;
}

There's some wrangling I'm doing at the end to handle @IndexedEmbedded fields. If you don't have those, you can simplify the code considerably by projecting only the searchFields and leaving out the document and embeddedFields handling.
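
As a rough sketch of that simplification: when every search field is projected directly, the loop over the results reduces to a case-insensitive filter over the projected values. The class and method names here are hypothetical, the rows are assumed to hold only projected field values (no Document in slot 0), and the null guard is an addition that assumes a projected field can be empty for some hits.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: the post-filtering step when no embedded fields are involved.
class SuggestionFilter {

    // rows: one Object[] per hit, holding only the projected field values.
    static List<String> filter(List<Object[]> rows, String searchTerm) {
        String needle = searchTerm.toLowerCase();
        List<String> suggestions = new ArrayList<String>();
        for (Object[] row : rows) {
            for (Object value : row) {
                // Null guard is an assumption: a field may be missing for a hit
                if (value != null && value.toString().toLowerCase().contains(needle)) {
                    suggestions.add(value.toString());
                }
            }
        }
        return suggestions;
    }
}
```

With this in place, the result of fullTextQuery.setProjection(searchfields).list() could be passed straight to filter(results, searchTerm).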

As before: hopefully this is useful to the next person who encounters this question. If anyone has critique of or improvements to the code posted above, feel free to edit, and please do let me know.


Edit 3: The project this code was taken from has since been open sourced. Here are the relevant classes:

https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-core/src/main/java/org/metidb/domain/Compound.java
https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-core/src/main/java/org/metidb/dao/CompoundDAOImpl.java
https://trac.nbic.nl/metidb/browser/trunk/metidb/metidb-search/src/main/java/org/metidb/search/text/Autocompleter.java

You could index the field using an NGramFilter as suggested here. For best results you should use the EdgeNGramFilter from Apache Solr, which creates n-grams from the beginning edge of a term and can be used in Hibernate Search as well.
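
To illustrate what "n-grams from the beginning edge" means, here is a small plain-Java sketch that expands a token into its prefixes. It is not the Solr filter itself: the class and method names are hypothetical, and the gram size limits are passed in as plain parameters for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain-Java stand-in for front-edge n-gram expansion.
class EdgeNGramSketch {

    // Returns the prefixes of the lowercased token with lengths minGram..maxGram.
    static List<String> edgeGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<String>();
        String lower = token.toLowerCase();
        int longest = Math.min(maxGram, lower.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(lower.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [a, as, asp, aspi, aspir, aspiri, aspirin]
        System.out.println(edgeGrams("Aspirin", 1, 1024));
    }
}
```

The trade-off is that only prefixes match: "asp" finds "aspirin" but "spi" does not. In exchange, the seven-letter token yields 7 index terms here, where expanding it into all of its substrings would yield 28.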

Tim's answer is brilliant and helped me get over the difficult part. However, it only worked for single-word queries for me. If anybody wants to make it work for phrase searches, just replace all the 'Term' instances with their corresponding 'Phrase' classes. Here are the replacement lines for Tim's code that did the trick for me.

// New DSL based query composition
// org.hibernate.search.query.dsl
SearchFactory searchFactory = fullTextSession.getSearchFactory();
QueryBuilder buildQuery = searchFactory.buildQueryBuilder().forEntity(MasterDiagnosis.class).get();
PhraseContext keyword = buildQuery.phrase();
keyword.withSlop(3);
// WildcardContext wildcard = keyword.wildcard();
String[] searchfields = MasterDiagnosis.getSearchfields();
PhraseMatchingContext onFields = keyword.onField(searchfields[0]);
for (int i = 1; i < searchfields.length; i++)
    onFields.andField(searchfields[i]);
PhraseTermination matching = onFields.sentence(lowerCasedSearchTerm);
Query query = matching.createQuery();
// Convert the Search Query into something that provides results: Specify Compound again to be future proof
