
Periodically flushing new documents to an index using Lucene

I need to flush the index periodically, meaning the index should be updated regularly as documents are added. What would you suggest as a solution? I need sample source code that shows how to flush an index.

For example, take the source code below.

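// Imports assume the Lucene 3.0.x API, which the constructor and calls below match.
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class SimpleFileIndexer {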
    public static void main(String[] args) throws Exception {
           File indexDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
           File dataDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
           String suffix = "txt";

           SimpleFileIndexer indexer = new SimpleFileIndexer();

           int numIndex = indexer.index(indexDir, dataDir, suffix);

           System.out.println("Total files indexed " + numIndex);
    }

    private int index(File indexDir, File dataDir, String suffix) throws Exception {
           IndexWriter indexWriter = new IndexWriter(
                           FSDirectory.open(indexDir),
                           new SimpleAnalyzer(),
                           true,
                           IndexWriter.MaxFieldLength.LIMITED);
           indexWriter.setUseCompoundFile(false);

           indexDirectory(indexWriter, dataDir, suffix);

           int numIndexed = indexWriter.maxDoc();
           indexWriter.optimize();
           indexWriter.close();

           return numIndexed;
    }

    private void indexDirectory(IndexWriter indexWriter, File dataDir, String suffix) throws IOException {
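           File[] files = dataDir.listFiles();
           if (files == null) {
                   // listFiles() returns null if dataDir is not a directory or cannot be read
                   return;
           }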
           for (int i = 0; i < files.length; i++) {
                   File f = files[i];
                   if (f.isDirectory()) {
                           indexDirectory(indexWriter, f, suffix);
                   }
                   else {
                           indexFileWithIndexWriter(indexWriter, f, suffix);
                   }
           }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter, File f, String suffix) throws IOException {
           if (f.isHidden() || f.isDirectory() || !f.canRead() || !f.exists()) {
                   return;
           }
           if (suffix!=null && !f.getName().endsWith(suffix)) {
                   return;
           }
           System.out.println("Indexing file " + f.getCanonicalPath());

           Document doc = new Document();
           doc.add(new Field("contents", new FileReader(f)));
           doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.ANALYZED));

           indexWriter.addDocument(doc);
    }
}

The source code above indexes the text files in a given directory. What I am asking is: how can I make this code run continuously, and what class should I use, so that every time new documents are added to that directory Lucene indexes them automatically? Can you help me out with this? I really need to know the best solution.

Lucene can't do this by itself. You will need to monitor the filesystem for that.

See the question "How to detect filesystem has changed in java".
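
One way to do the monitoring on Java 7 or later is the JDK's own `java.nio.file.WatchService`. The following is only a minimal sketch, not a tested solution for your setup: `DirectoryWatcher`, `watch`, `hasSuffix` and the `onNewFile` callback are names made up for this example. It has no Lucene dependency. The callback is the place where you would call your `indexFileWithIndexWriter(...)` on a long-lived `IndexWriter` and then `indexWriter.commit()` (assuming a Lucene version that has `commit()`) so the new document is flushed to the index.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.function.Consumer;

public class DirectoryWatcher {

    /** Same rule as the question's code: a null suffix accepts every file. */
    public static boolean hasSuffix(Path file, String suffix) {
        return suffix == null || file.getFileName().toString().endsWith(suffix);
    }

    /**
     * Blocks and calls onNewFile for every matching file created directly in dir.
     * Returns when the thread is interrupted or dir can no longer be watched.
     */
    public static void watch(Path dir, String suffix, Consumer<Path> onNewFile)
            throws IOException {
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            while (true) {
                WatchKey key;
                try {
                    key = watcher.take(); // waits until at least one event is queued
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were dropped; a full rescan would be needed here
                    }
                    // The event context is the new entry's name relative to dir.
                    Path created = dir.resolve((Path) event.context());
                    if (hasSuffix(created, suffix)) {
                        onNewFile.accept(created);
                    }
                }
                if (!key.reset()) {
                    return; // directory was deleted or is no longer accessible
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dataDir = Paths.get(args.length > 0 ? args[0] : ".");
        // Hypothetical hook: replace the println with the indexing call and a commit.
        watch(dataDir, ".txt", file -> System.out.println("New file to index: " + file));
    }
}
```

Two caveats with this sketch: a `WatchService` registration covers only the one directory, so subdirectories (which your `indexDirectory` walks recursively) would each need their own registration, and the create event can arrive before the file has been completely written, so you may need to wait or retry before reading it.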

