简体   繁体   中英

Java Lucene IndexReader not working correctly

The story is this. I want to mimic the behavior of a relational database using a Lucene index in java. I need to be able to do searching(reading) and writing at the same time.

For example, I want to save Project information into an index. For simplicity, let's say that the project has 2 fields - id and name. Now, before adding a new project to the index, I'm searching if a project with a given id is already present. For this I'm using an IndexSearcher. This operation completes with success (namely the IndexSearcher returns the internal doc id for the document that contains the project id I'm looking for). Now I want to actually read the value of this project ID, so I'm using now an IndexReader to get the indexed Lucene document from which I can extract the project id field. The problem is that the IndexReader return a Document that has all of the fields NULL. So, to repeat IndexSearcher works correctly, IndexReader returns bogus stuff.

I'm thinking that somehow this has to do with the fact that the document fields data does not get saved on the hard disk when the IndexWriter is flushed. The thing is that the first time I do this indexing operation, IndexReader works good. However after a restart of my application, the above mentioned situation happens. So I'm thinking that the first time around data floats in RAM, but doesn't get flushed correctly (or totally since IndexSearcher works) on the hard drive.

Maybe it will help if I give you the source code, so here it is (you can safely ignore the tryGetIdFromMemory part, I'm using that as an speed optimization trick):

public class ProjectMetadataIndexer {
private File indexFolder;
private Directory directory;
private IndexSearcher indexSearcher;
private IndexReader indexReader;
private IndexWriter indexWriter;
private Version luceneVersion = Version.LUCENE_31;

private Map<String, Integer> inMemoryIdHolder;
private final int memoryCapacity = 10000;

public ProjectMetadataIndexer() throws IOException {
    inMemoryIdHolder = new HashMap<String, Integer>();

    indexFolder = new File(ConfigurationSingleton.getInstance()
            .getProjectMetaIndexFolder());

    directory = FSDirectory.open(indexFolder);
    IndexWriterConfig config = new IndexWriterConfig(luceneVersion,
            new WhitespaceAnalyzer(luceneVersion));
    indexWriter = new IndexWriter(directory, config);

    indexReader = IndexReader.open(indexWriter, false);

    indexSearcher = new IndexSearcher(indexReader);

}

public int getProjectId(String projectName) throws IOException {
    int fromMemoryId = tryGetProjectIdFromMemory(projectName);
    if (fromMemoryId >= 0) {
        return fromMemoryId;
    } else {
        int projectId;

        Term projectNameTerm = new Term("projectName", projectName);
        TermQuery projectNameQuery = new TermQuery(projectNameTerm);

        BooleanQuery query = new BooleanQuery();
        query.add(projectNameQuery, Occur.MUST);

        TopDocs docs = indexSearcher.search(query, 1);
        if (docs.totalHits == 0) {
            projectId = IDStore.getInstance().getProjectId();
            indexMeta(projectId, projectName);
        } else {
            int internalId = docs.scoreDocs[0].doc;
            indexWriter.close();
            indexReader.close();
            indexSearcher.close();

            indexReader = IndexReader.open(directory);
            Document document = indexReader.document(internalId);
            List<Fieldable> fields = document.getFields();
            System.out.println(document.get("projectId"));
            projectId = Integer.valueOf(document.get("projectId"));
        }

        storeInMemory(projectName, projectId);

        return projectId;
    }
}

private int tryGetProjectIdFromMemory(String projectName) {
    String key = projectName;
    Integer id = inMemoryIdHolder.get(key);
    if (id == null) {
        return -1;
    } else {
        return id.intValue();
    }
}

private void storeInMemory(String projectName, int projectId) {
    if (inMemoryIdHolder.size() > memoryCapacity) {
        inMemoryIdHolder.clear();
    }
    String key = projectName;
    inMemoryIdHolder.put(key, projectId);
}

private void indexMeta(int projectId, String projectName)
        throws CorruptIndexException, IOException {
    Document document = new Document();

    Field idField = new Field("projectId", String.valueOf(projectId),
            Store.NO, Index.ANALYZED);
    document.add(idField);

    Field nameField = new Field("projectName", projectName, Store.NO,
            Index.ANALYZED);
    document.add(nameField);

    indexWriter.addDocument(document);
}

public void close() throws CorruptIndexException, IOException {
    indexReader.close();
    indexWriter.close();
}

}

To be more precise all the problems occur in this if:

if (docs.totalHits == 0) {
        projectId = IDStore.getInstance().getProjectId();
        indexMeta(projectId, projectName);
    } else {
        int internalId = docs.scoreDocs[0].doc;

        Document document = indexReader.document(internalId);
        List<Fieldable> fields = document.getFields();
        System.out.println(document.get("projectId"));
        projectId = Integer.valueOf(document.get("projectId"));
    }

On the else branch... I don't know what is wrong.

Do you store the respective fields? If not, the fields are "only" stored in the reverse index part, ie the field value is mapped to the document, but the document itself doesn't contain the field value.

The part of the code where you save the document might be helpful.

I had a hard time figuring out how to index/search numbers and I just wanted to say the following snippets of code really helped me out:

projectId = Integer.valueOf(document.get("projectId"));

////////////

Field idField = new Field("projectId", String.valueOf(projectId),
            Store.NO, Index.ANALYZED);
    document.add(idField);

Thanks!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM