
Are RDF triple stores suitable for everyday programming?

I like these a lot and would like to use them for everything. Why would using RDF triple stores for everyday programming (entities/tables such as Contacts, Customers, Company, etc.) be a bad idea?

Are there shortfalls in the technology that I have not come across? I was concerned about retrieving data, but I think this is covered by SPARQL.

There are no one-size-fits-all tools. Triple stores are appropriate and usable today for some kinds of tasks and not for others.

A similar question was asked on semanticoverflow.com and the common answer was the same: "use whatever is appropriate".

Query times tend to be much slower than for conventional DBs, even for simple queries. Also, many RDF stores don't support standard DB features such as transactions and crash recovery.

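One of the shortcomings we have come across in using RDF triple stores for general programming is that most engines don't support aggregation in queries (min, max, group by). SPARQL 1.0 defines no aggregates; they were added in SPARQL 1.1, so support depends on the engine and version.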
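When the engine at hand has no aggregates, one workaround is to fetch the raw bindings and aggregate in application code. A minimal sketch, assuming the results of a plain SELECT have already arrived as (company, salary) pairs; the data and the function name are made up for illustration:

```python
from collections import defaultdict

# Hypothetical result rows of a plain "SELECT ?company ?salary" query.
rows = [
    ("acme", 52000),
    ("acme", 61000),
    ("globex", 48000),
]

def group_min_max(rows):
    """Emulate GROUP BY on the first column with MIN/MAX on the second."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: (min(vals), max(vals)) for key, vals in groups.items()}

summary = group_min_max(rows)
```

This works, but every matching binding has to travel to the client first, which is part of why missing aggregation hurts on larger datasets.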

A checklist we use to decide between an RDBMS and an RDF store is the following:

RDBMS if

  • static schema
  • very large amount of data
  • no RDF export needed
  • Lucene support needed (easy via Hibernate Search for example)
  • strong data consistency requirements (money involved etc)

RDF if

  • schema not fixed, or dynamic
  • small to large amount of data
  • RDF export needed
  • loose data consistency requirements

Refactoring RDBMS schemas for ongoing projects can be quite an overhead if you don't have the correct tools.

Lucene support is provided by some RDF engines as well, but is not as well documented and supported as in the case of Hibernate Search.

Scalability of RDF engines is also improving steadily, as ideas from the NoSQL side are incorporated into them, but if you go with the standard engines, Jena and Sesame, this division is still quite valid.

One issue that has not been mentioned yet is that updating triplestores to reflect changing data is often more work than for an RDBMS or OODBMS, because there is no notion of an 'object' or 'row', only triples and resources. Deleting a domain object therefore requires care, or you will end up with a lot of garbage left in the triplestore. The absence of cascading deletes is a closely related issue.
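The garbage problem can be sketched with a toy in-memory store. This is only an illustration using a Python set of (subject, predicate, object) tuples, not any particular engine's API; all resource names are invented:

```python
# Triples are (subject, predicate, object) tuples; all names are made up.
store = {
    ("ex:alice", "rdf:type", "ex:Contact"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:knows", "ex:alice"),
    ("ex:acme", "ex:name", "Acme"),
}

def delete_resource(store, resource):
    """Drop every triple that mentions the resource as subject or object."""
    return {t for t in store if t[0] != resource and t[2] != resource}

# Naive delete: removes only the triples whose subject is alice,
# so bob's link to the now-missing resource is left behind.
naive = {t for t in store if t[0] != "ex:alice"}

# Careful delete: also removes incoming references.
clean = delete_resource(store, "ex:alice")
```

Even the careful version does not cascade: anything reachable only through the deleted resource, such as a blank-node address, would need its own cleanup pass.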

On the plus side, RDF can be helpful even for everyday applications because you can flexibly add new subclasses, relationships or sub-relationships between entities without necessarily breaking any code, and easily add annotations, comments etc to resources.
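As a rough illustration of that flexibility, again with a toy set of tuples rather than a real triple store and with invented predicate names, existing code that reads one predicate keeps working when new relationships and annotations are added:

```python
store = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
}

def names(store):
    """Existing code: reads only the ex:name predicate."""
    return sorted(o for s, p, o in store if p == "ex:name")

before = names(store)

# Later additions: a new relationship and a free-text annotation.
# No schema migration is involved; these are just more triples.
store.add(("ex:alice", "ex:mentors", "ex:bob"))
store.add(("ex:alice", "rdfs:comment", "Prefers email"))

after = names(store)
```

In a relational schema the same change would typically mean a new column or join table plus a migration.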

Further to Peteris's answer, there are some key differences between how you model data for a triple store and how you model it with other techniques such as OOP, relational databases or XML (e.g. rows, classes, properties).

Whether they are appropriate depends very much on what you want to do, and on whether you can find one with the right performance characteristics for your application.

People have a tendency to characterise triple stores as schema-less databases, but realistically, unless you are using some form of schema/ontology, they aren't particularly useful. If you want to use SPARQL to get stuff out, then there need to be some schema patterns in the store that you can write queries against.
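A small sketch of why some agreed pattern is needed, using a hand-rolled pattern matcher over tuples as a stand-in for a SPARQL basic graph pattern. The `match` helper and the `ex:fullName` predicate are hypothetical:

```python
store = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "ex:fullName", "Bob"),  # same meaning, different predicate
]

def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# A query written against foaf:name silently misses bob.
found = match(store, p="foaf:name")
```

The store accepted both triples without complaint; it is the query side that needs the data to follow a known vocabulary.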

Personally, I still use relational databases for a lot of things. I'm using RDF and triple stores for an increasing amount of stuff, but that doesn't mean I'm ready to throw out what works well.

As a final point, even if you go with a relational database for the time being, there are technologies like DB2RDF which can convert relational databases to RDF, so you can stick with a DB for now and export your database to RDF in the future as desired.
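To give a feel for what such an export involves, here is a hand-rolled sketch using Python's built-in sqlite3 module. It is not how DB2RDF or any other mapping tool works internally; the table, the `ex:` prefix and the `rows_to_triples` helper are assumptions made up for this example:

```python
import sqlite3

def rows_to_triples(conn, table, key_column, base="ex:"):
    """Map each row to triples: one subject per row, one predicate per column.

    The table name is interpolated into SQL, so pass trusted values only.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{base}{table}/{record[key_column]}"
        for column, value in record.items():
            # NULLs simply produce no triple.
            if column != key_column and value is not None:
                triples.append((subject, f"{base}{column}", value))
    return triples

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice', 'Riga'), (2, 'Bob', NULL)")
triples = rows_to_triples(conn, "customer", "id")
```

Real mapping tools also handle foreign keys, datatypes and URI design, which this sketch ignores.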

