Metadata extraction with Apache Jackrabbit

I used Alfresco a little, and it had a thin abstraction layer on top of Apache Tika for extracting metadata from documents.

I decided to use only Jackrabbit because I don't need such a robust solution. But apart from the jackrabbit-text-extractors module, I don't see any other support for document metadata (Dublin Core properties).

Moreover, the jackrabbit-text-extractors Maven artifact, version 2.0-SNAPSHOT, doesn't seem to be in SVN trunk.

Could anybody please tell me which approach to choose?

My understanding is that Jackrabbit now uses Tika for text extraction:

https://issues.apache.org/jira/browse/JCR-1878
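
If Jackrabbit itself only gives you extracted text, one option might be to call Tika directly before storing the document and copy whatever metadata you need onto the node yourself. Below is a minimal sketch of the Tika side only, assuming tika-core (and, for real document formats, the Tika parsers) is on the classpath. The class name `MetadataSketch` and the `extract` helper are made up for illustration; which keys come back (Dublin Core or otherwise) depends on the parser and the Tika version, so this just returns everything Tika reports.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical helper class, not part of Jackrabbit or Tika.
public class MetadataSketch {

    // Parses the stream with Tika's auto-detecting parser and returns
    // every metadata key Tika reported, with its first value.
    public static Map<String, String> extract(InputStream in) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        Metadata metadata = new Metadata();
        // -1 disables the handler's write limit; the body text is ignored here.
        parser.parse(in, new BodyContentHandler(-1), metadata, new ParseContext());

        Map<String, String> result = new LinkedHashMap<>();
        for (String name : metadata.names()) {
            result.put(name, metadata.get(name));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java MetadataSketch <path-to-document>
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            extract(in).forEach((k, v) -> System.out.println(k + " = " + v));
        }
    }
}
```

The resulting map could then be written to properties on the JCR node with whatever naming scheme suits your repository; that mapping is left out here because it depends on your node types.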
