
Indexing HTML files using SOLR

Original post 2013-02-22 07:55:06 0 1 solr/ lucene/ indexing

I am trying to index a set of HTML files using SOLR. The basic idea is to implement site search for a website we have developed. I am very new to Lucene and SOLR; I have tried a few of the samples available on the site and have indexed a few documents with them, but I have not been able to work out what the best approach would be. Some sources suggest using DataImportHandler, while in other places I see ExtractingRequestHandler being used. My own simple attempt used ExtractingRequestHandler. Also, I will have to keep the list of files up to date: for example, some HTML files may be removed in the future and others may be added. Please suggest which factors to consider when choosing an approach.

Cheers!!

1 answer

I would recommend you use Nutch to crawl and index your HTML files into Solr. It has built-in support for tracking the removal and addition of files on the site.

Also check out the Nutch Wiki for tutorials on getting started.
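If you would rather stay with the ExtractingRequestHandler attempt from the question than set up a crawler, the add/remove bookkeeping has to be done by hand. Below is a rough Python sketch of that idea, not a tested recipe. It assumes the handler is mapped at `/update/extract` (as in the stock example config), that each file's relative path is used as the document id, and that your Solr version accepts a JSON delete-by-id list on `/update`. The core URL and all function names are made up for illustration. It only detects new and removed files, not modified ones.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical core URL; adjust to your own Solr install.
SOLR_CORE_URL = "http://localhost:8983/solr/site"


def extract_url(core_url, doc_id, commit=False):
    """Build the URL for sending one file to the ExtractingRequestHandler."""
    params = {"literal.id": doc_id}  # literal.id sets the document's id field
    if commit:
        params["commit"] = "true"
    return core_url.rstrip("/") + "/update/extract?" + urlencode(params)


def plan_sync(current_files, indexed_ids):
    """Compare the files on disk with the ids already indexed."""
    current, indexed = set(current_files), set(indexed_ids)
    to_add = sorted(current - indexed)     # new files to post
    to_delete = sorted(indexed - current)  # files that no longer exist
    return to_add, to_delete


def delete_payload(ids):
    """JSON body for a delete-by-id request against /update."""
    return json.dumps({"delete": list(ids)})


def post_html(core_url, doc_id, path):
    """Send one HTML file to Solr (needs a running Solr instance)."""
    with open(path, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        extract_url(core_url, doc_id, commit=True),
        data=body,
        headers={"Content-Type": "text/html"},
    )
    return urllib.request.urlopen(req).status
```

You would have to keep the list of previously indexed ids somewhere yourself (a text file, or a query against Solr), which is exactly the kind of tracking a crawler such as Nutch does for you.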

Indexing log files using Solr

Indexing pdf and html files in solr shows error in html indexing

Indexing Arabic PDF Files using Solr

Indexing Text files using apache solr and tika

Using Solr for indexing HTML tags with attributes

Indexing SVG files with SOLR

Indexing HTML with solr

HTML indexing with solr

Solr indexing HTML entities

Indexing HTML in Solr DataImportHandler





solution1 0 ACCEPTED 2013-02-22 13:38:26
