
Apache Stanbol scalability and real-world applications

I'm starting a project whose requirements include NLP, storage of semantic data, content management, etc. Apache Stanbol seems like a nice fit, but I'm not sure it's ready, so I'm trying to assess it properly before I start working with it. A few things worry me:

  1. Stanbol seems a bit young and immature (newest version 0.12). Has anybody used it in a commercial project/application/setup (I failed to find this information online)? What is the scale of those projects?

  2. How horizontally scalable is Stanbol? What are its cloud/clustering capabilities? As far as I know it relies on Apache Jena for storage, and Jena storage isn't horizontally scalable, which would make Stanbol unable to scale horizontally as well. That is my current understanding; please correct me if I'm wrong. Can Jena be swapped for something else as the RDF storage provider?

  3. Learning resources for Stanbol seem a little scarce. Does anyone know of a place/book/whatever where I can learn more about how Stanbol works under the hood (other than the official Stanbol website and the IKS website)? Are there any good alternatives? I know there are nice alternatives for NLP (e.g. GATE, UIMA), but they lack CMS capabilities.

Thanks.

To your questions:

  • 1) I've been working on a project involving Stanbol (version 0.10). It's still in the pre-production stage. For the CMS, we evaluated Jackrabbit and Alfresco; Alfresco (CMIS) turned out to be the better choice in our case. What I like about Stanbol is the enhancement chains and the set of Enhancement Engines that come by default. This is a small to mid-size project.
  • 3) I found this book (Instant Apache Stanbol, Packt Publishing) very practical and useful in my work, especially the sections on the Entityhub and Enhancement Engines.
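For readers who have not seen an enhancement chain in use, here is a minimal Python sketch of how a request to the Stanbol Enhancer's REST endpoint could be built. It assumes a Stanbol launcher running locally at http://localhost:8080 (an assumption, adjust to your setup); the helper name `build_enhancer_request` and the chain name `my-chain` are hypothetical and only for illustration.

```python
from typing import Optional
from urllib.request import Request


def build_enhancer_request(
    text: str,
    base_url: str = "http://localhost:8080",  # assumed local Stanbol launcher
    chain: Optional[str] = None,              # None -> the default chain
    accept: str = "text/turtle",              # requested RDF serialization
) -> Request:
    """Build (but do not send) a POST request for the Stanbol Enhancer.

    The default chain is addressed at /enhancer; a named chain at
    /enhancer/chain/<name>.
    """
    path = "/enhancer" if chain is None else f"/enhancer/chain/{chain}"
    return Request(
        url=base_url.rstrip("/") + path,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,
        },
        method="POST",
    )


# "my-chain" is a hypothetical chain name, not one shipped with Stanbol.
req = build_enhancer_request("Paris is the capital of France.", chain="my-chain")
```

Passing `req` to `urllib.request.urlopen` would send the text to a running Stanbol instance, which responds with the enhancements as RDF in the requested format.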

A viable option is to use Redlink, which offers content analysis and linked data services in the cloud, using Apache Stanbol and Apache Marmotta in the back end.

The Redlink team has worked on IKS and Apache Stanbol, so getting in contact with them can be a good starting point when deciding whether to use these technologies in production environments.
