Apache Lucene TokenStream Filters

Original post 2012-08-23 19:41:48 3 1 java/ lucene/ machine-learning

I have some questions regarding the Apache Lucene library:

1) How can I concatenate two TokenStream objects into one TokenStream object?

2) Which filter can be used to remove all duplicate tokens (tokens with the same value) from a TokenStream object?

Thanks in advance.

1 answer

To concatenate tokens from two sources, just add two Field instances with the same name to the Document. Lucene indexes them as if they were a single field whose value is the two values concatenated.
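To make the multi-valued field idea concrete, here is a minimal plain-Java sketch that does not use Lucene at all. The class `MultiValuedFieldSketch`, its methods, and the whitespace tokenizer are hypothetical stand-ins for a Document and an analyzer; the only point is that adding two values under one name yields the same token sequence as adding their concatenation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of a document with multi-valued fields (not Lucene's API).
public class MultiValuedFieldSketch {
    // Field name -> tokens in the order they were added.
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    // Stand-in for analysis: lowercase the text and split on whitespace.
    public static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Adding a value under an existing name appends its tokens to that field.
    public void add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).addAll(tokenize(value));
    }

    // Returns the tokens recorded for a field, or an empty list if it is absent.
    public List<String> tokens(String name) {
        return fields.getOrDefault(name, new ArrayList<>());
    }
}
```

In this model, calling `add("body", "Quick brown")` and then `add("body", "fox jumps")` leaves the same tokens under `body` as a single `add("body", "Quick brown fox jumps")`.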

Eliminating duplicate terms is not really necessary. Lucene only uses the number of times a term occurs in a document (its term frequency) to score that document higher. If you don't want that, you can define your own Similarity that implements tf as a constant of 1.

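Or, if you need to disable term frequency for one field only, configure that field to omit term frequencies (in recent Lucene versions, by setting its index options to IndexOptions.DOCS_ONLY). Instantiating the Field with Field.TermVector.NO only turns off term vectors and does not change scoring.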
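The effect of a constant tf can be sketched in plain Java without Lucene. The class and method names below are hypothetical; `defaultTf` mirrors the square-root damping used by Lucene's DefaultSimilarity, and `constantTf` is the variant suggested above. In Lucene itself the equivalent would presumably be a DefaultSimilarity subclass that overrides `tf(float freq)`.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of the tf factor only (not Lucene's API).
public class ConstantTfSketch {
    // Raw term frequency: how many times the term occurs in the token list.
    public static int frequency(List<String> tokens, String term) {
        return Collections.frequency(tokens, term);
    }

    // Square-root damping, as in Lucene's DefaultSimilarity.
    public static double defaultTf(int freq) {
        return Math.sqrt(freq);
    }

    // Constant variant: any matching document gets the same tf factor.
    public static double constantTf(int freq) {
        return freq > 0 ? 1.0 : 0.0;
    }
}
```

With `constantTf`, a document containing a term three times gets the same factor as one containing it once, so removing duplicate tokens beforehand would not change the score.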

solution1 0 2012-08-24 08:48:27
