How can I use a byte array as a Lucene index field?

I switched a Lucene index from using string document ids to byte arrays. The problem I'm having is that the system no longer finds documents by their id. I suspect this is because the Lucene code is not doing an Arrays.equals(), but rather a standard equals(). This is the code to add the document:

Document doc = new Document();
byte[] key = indexData.getKey().toByteArray();
System.out.println(Arrays.toString(key));
doc.add(new StoredField(DOCUMENT_PRIMARY_KEY, new BytesRef(key)));
writer.addDocument(doc);

And this is the code to delete the document. The delete fails because the document is not found (although it does exist in the index).

void prepareDelete(byte[] documentId) throws IOException {
    System.out.println(Arrays.toString(documentId));
    Term term = new Term(DOCUMENT_PRIMARY_KEY, new BytesRef(documentId));
    writer.deleteDocuments(term);
}

By comparing the output of the print statements, I've determined that the keys are the same (in the sense that they contain the same bytes) but they do not share identity.
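The difference between "same bytes" and "same identity" can be reproduced in plain Java. The sketch below is only an illustration of that language-level distinction; the class and method names are made up for the example, and it makes no claim about how Lucene compares terms internally.

```java
import java.util.Arrays;

// Hypothetical demo class: contrasts identity and content comparison for byte arrays.
public class ByteArrayEquality {

    // True only when both references point to the same array object.
    static boolean sameIdentity(byte[] a, byte[] b) {
        return a == b;
    }

    // Arrays do not override Object.equals(), so this is also an identity check.
    static boolean objectEquals(byte[] a, byte[] b) {
        return a.equals(b);
    }

    // Compares length and every element.
    static boolean sameContents(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        byte[] first = {1, 2, 3};
        byte[] second = {1, 2, 3};
        System.out.println("identity:      " + sameIdentity(first, second));
        System.out.println("Object.equals: " + objectEquals(first, second));
        System.out.println("Arrays.equals: " + sameContents(first, second));
    }
}
```

Two arrays that print identically with Arrays.toString() therefore satisfy Arrays.equals() while failing both == and equals().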

I'm using Lucene 4.10.3.

From an answer posted on the Lucene mailing list:

You are indexing your field as a StoredField which means it's not actually indexed (just stored), so no query (nor IW.deleteDocument) will ever be able to find it.

Try StringField instead ... in recent versions you can pass a BytesRef value to that.

I updated to Lucene 5.3, in which StringField has a constructor that takes a BytesRef value, and this fixed the problem.
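A minimal sketch of the fix, assuming Lucene 5.3 with the core and analyzers-common jars on the classpath. The field name "id", the class name, and the temporary on-disk index are assumptions made for the example; only the StringField constructor taking a BytesRef and the Term-based delete come from the discussion above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

// Hypothetical demo class: index a byte[] key as a StringField, then delete by that key.
public class ByteKeyIndexDemo {

    // Assumed field name; the original code uses a constant whose value is not shown.
    static final String DOCUMENT_PRIMARY_KEY = "id";

    // Adds one document keyed by the given bytes, deletes it by key,
    // and returns the number of live documents left in the index.
    static int liveDocsAfterDelete(byte[] key) throws IOException {
        // Temporary index directory; it is not cleaned up in this sketch.
        Path indexPath = Files.createTempDirectory("lucene-byte-key");
        try (Directory directory = FSDirectory.open(indexPath);
             IndexWriter writer = new IndexWriter(
                     directory, new IndexWriterConfig(new StandardAnalyzer()))) {

            Document doc = new Document();
            // StringField is indexed as a single untokenized term, so it can be
            // matched by a Term; StoredField alone is stored but not indexed.
            doc.add(new StringField(DOCUMENT_PRIMARY_KEY, new BytesRef(key), Field.Store.YES));
            writer.addDocument(doc);
            writer.commit();

            // Delete using a separate array with the same contents.
            writer.deleteDocuments(new Term(DOCUMENT_PRIMARY_KEY, new BytesRef(key.clone())));
            writer.commit();

            try (DirectoryReader reader = DirectoryReader.open(directory)) {
                return reader.numDocs();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("live documents: " + liveDocsAfterDelete(new byte[] {1, 2, 3}));
    }
}
```

The delete is issued with a cloned array on purpose: the term only needs to carry the same bytes as the indexed key, not the same array instance.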
