Indexing zip files with Lucene

Original 2013-02-15 05:46:20 4 1 java/ lucene

Is it possible to index zipped folders in Lucene? If I unzip them, the content is too large. If I just index the zipped folders containing text files as they are, the search does not work properly. Can Lucene index the contents without extracting the zip files?

1 answer

Lucene is just a search library, and there is no way it can "know" every possible scenario, e.g. how to index XML documents, Word files, files inside a .zip, files created by the Chernobyl power plant, etc.

What Lucene does provide is an API that lets you hook your own data into it.

If unzipping the contents of the archive file is not an option, you could write a class that reads the zip file (but does not unzip it on the disk) and feeds this data into Lucene.
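A minimal sketch of such a class, using only java.util.zip from the JDK. The names ZipTextReader and forEachEntry are made up for illustration, and the code assumes every entry is a UTF-8 text file small enough to hold in memory. The callback is the place where the Lucene-specific part would go.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper: streams zip entries to a callback without writing them to disk. */
class ZipTextReader {

    /**
     * Calls sink.accept(entryName, text) for every regular file in the archive.
     * Assumption: each entry is UTF-8 text and fits in memory.
     */
    static void forEachEntry(Path zipPath, BiConsumer<String, String> sink) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directories carry no content to index
                }
                // The entry is decompressed into memory only, never onto the disk.
                try (InputStream in = zip.getInputStream(entry)) {
                    sink.accept(entry.getName(), readAll(in));
                }
            }
        }
    }

    /** Reads the whole stream and decodes it as UTF-8. */
    private static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Inside the callback you would typically build one Lucene Document per entry (for example, the entry name in one field and the text in another) and pass it to IndexWriter.addDocument. The exact field classes depend on the Lucene version you are using.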

If your primary concern is the size of the index, there is not much you can do to reduce it, but a few tips may help:

Try indexing without stopwords.
Do not store the fields, only index them (hint: Field.Store.NO ).
Always lowercase all terms to reduce the term count.
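
To see why lowercasing and stopword removal shrink the term dictionary, here is a small plain-Java sketch. It does not use Lucene. In a real index this work is done by the analyzer you configure. TermNormalizer and its tiny stopword list are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares distinct terms before and after normalization. */
class TermNormalizer {

    // Tiny illustrative stopword list (an assumption, not Lucene's default set).
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "is", "of", "the", "to"));

    /** Distinct tokens exactly as they appear, split on anything that is not a letter or digit. */
    static Set<String> rawTerms(String text) {
        Set<String> terms = new TreeSet<>();
        for (String token : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Distinct tokens after lowercasing and dropping stopwords. */
    static Set<String> normalizedTerms(String text) {
        Set<String> terms = new TreeSet<>();
        for (String token : rawTerms(text)) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOPWORDS.contains(lower)) {
                terms.add(lower);
            }
        }
        return terms;
    }
}
```

For the text "The index and the Index of Lucene", rawTerms yields 7 distinct terms, while normalizedTerms yields only 2 (index, lucene).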

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1 1 ACCEPTED 2013-02-15 11:09:21
