ArangoDB: a document database and also a graph database? How is that possible?

Could someone explain how a document database can also work as a graph database?

What is the difference between ArangoDB and Neo4j?

Disclaimer: I am Max from ArangoDB, one of the core developers.

First of all, a longer discussion of this and other related questions can be found in my article "Graphs in data modeling - is the emperor naked?", but I will try to answer both questions concisely here.

(1) Storing a graph in a document store is relatively easy (as it is in a relational database). One can, for example, simply store a document for each vertex in a "vertex collection" and a document for each edge in an "edge collection". One only has to make sure that each edge stores which vertex it comes from and which vertex it goes to. In ArangoDB, we use the _from and _to attributes in the edge document for this.
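As a rough illustration of that layout, here is a minimal in-memory sketch in Python. It does not use ArangoDB or any driver. The collection name "persons", the "name" and "since" attributes, and the helper functions are made up for the example; only the _from and _to attribute names come from the description above.

```python
# In-memory stand-ins for a vertex collection and an edge collection.
# All names and attributes except _from/_to are hypothetical.
vertices = {
    "persons/alice": {"name": "Alice"},
    "persons/bob": {"name": "Bob"},
    "persons/carol": {"name": "Carol"},
}

# One document per edge; each edge records where it comes from and goes to.
edges = [
    {"_from": "persons/alice", "_to": "persons/bob", "since": 2010},
    {"_from": "persons/bob", "_to": "persons/carol", "since": 2014},
]


def build_edge_index(edge_docs):
    """Group edge documents by their _from vertex for fast lookup."""
    index = {}
    for edge in edge_docs:
        index.setdefault(edge["_from"], []).append(edge)
    return index


def out_neighbors(index, vertex_id):
    """Return the ids of the vertices one outgoing edge away."""
    return [edge["_to"] for edge in index.get(vertex_id, [])]


edge_index = build_edge_index(edges)
print(out_neighbors(edge_index, "persons/alice"))  # ['persons/bob']
```

The dictionary built by `build_edge_index` plays the role of an index on the edge collection: finding the outgoing edges of a vertex becomes a single lookup instead of a scan over all edges.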

However, the crucial capability of a graph database is that it can answer queries about graphs efficiently. Typical graph queries are (a) "what are the neighbors of a vertex in the graph?", (b) "what is the shortest path from vertex A to vertex B in the graph?" and (c) "give me all vertices I can reach from vertex A by following edges". Whereas (a) simply needs a good index on the edge collection, (b) and (c) involve a number of steps in the graph that is not known in advance. Therefore, (b) and (c) cannot be done efficiently with traditional database query languages like SQL, simply because they would involve a large amount of communication between client and server, or at the very least a very complicated expression with a variable number of joins. I therefore call queries like (b) and (c) "graphy", without defining this rigorously.
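To make the "unknown number of steps" point concrete, here is a small sketch of queries (b) and (c) in plain Python over in-memory edge documents. It assumes unweighted edges, so "shortest" means fewest edges, and it uses breadth-first search as one possible technique. The vertex names and function names are made up for illustration, and this is not how ArangoDB implements these queries internally.

```python
from collections import deque

# Hypothetical edge documents, using the _from/_to convention.
edges = [
    {"_from": "A", "_to": "B"},
    {"_from": "A", "_to": "C"},
    {"_from": "B", "_to": "D"},
    {"_from": "C", "_to": "D"},
    {"_from": "D", "_to": "E"},
    {"_from": "X", "_to": "A"},
]

# Index on the edge collection: vertex -> list of outgoing neighbors.
outgoing = {}
for edge in edges:
    outgoing.setdefault(edge["_from"], []).append(edge["_to"])


def shortest_path(start, goal):
    """Query (b): breadth-first search; returns a vertex list or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the parent links back to the start vertex.
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for neighbor in outgoing.get(current, []):
            if neighbor not in parent:
                parent[neighbor] = current
                queue.append(neighbor)
    return None


def reachable(start):
    """Query (c): all vertices reachable from start by following edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in outgoing.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)  # report only the other vertices
    return seen


print(shortest_path("A", "E"))  # ['A', 'B', 'D', 'E']
print(sorted(reachable("A")))   # ['B', 'C', 'D', 'E']
```

Each loop iteration needs the neighbors of a vertex discovered in an earlier iteration. If this loop ran in a client, every iteration could turn into a separate round trip to the server, which is the communication cost described above.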

Therefore, my short answer to "how can a document store be a graph database?" is: store the graph as described above and implement graphy queries in the database server, accessible from the query language of the data store. In principle, the same could be done with a relational database and some considerable extensions to SQL.

With ArangoDB we have managed to combine the document, graph and key/value features in a single, coherent query language. We therefore call ArangoDB a "multi-model database", because it combines these three data models seamlessly. You can even mix the data models in a single query!

This leads to my answer to question (2), which is obviously a bit biased:

In comparison to ArangoDB, which is a distributed multi-model database in the above sense, Neo4j is a classical graph database. It stores graphs, allows querying them with "graphy queries" and has a storage and query engine that is optimised for that. Neo4j is particularly good at matching paths using its built-in query language, Cypher. It does allow attaching properties to vertices and edges, but it is not a full-featured document store. It is not optimised for document queries that use multiple secondary indexes, nor does it do joins. Furthermore, Neo4j is not distributed.

Neo4j is written in Java; ArangoDB is written in C++ and embeds Google's V8 to execute JavaScript extensions.

For a performance comparison, see this post.
