Store hierarchical data in the best way: NoSQL or SQL

I am working with hierarchical data in a tree structure, and I want to know the best way to store it in a database.

I started with an adjacency list in MySQL, but performance seems to dip as the data grows. I have around 20,000 rows stored in a MySQL table with a parent-child relationship, and that number will increase in the future. Fetching data takes a very long time because I have to write many self joins, depending on the depth of the tree.
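One possible way to avoid writing one self join per level is a recursive common table expression, which MySQL supports from version 8.0. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs as-is. The table name `category`, its columns, and the sample rows are assumptions for illustration, not the schema described above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory adjacency-list table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE category ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES category(id),"
        " name TEXT NOT NULL)"
    )
    # An index on parent_id keeps each recursive step from scanning the table.
    conn.execute("CREATE INDEX category_parent_idx ON category (parent_id)")
    rows = [
        (1, None, "A"),
        (2, 1, "B"),
        (3, 1, "C"),
        (4, 3, "D"),
        (5, 3, "E"),
        (6, 3, "F"),
        (7, 2, "G"),
    ]
    conn.executemany(
        "INSERT INTO category (id, parent_id, name) VALUES (?, ?, ?)", rows
    )
    return conn


# One query walks the tree to any depth: the anchor row is the chosen root,
# and the recursive part joins children onto rows found so far.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, name, depth) AS (
    SELECT id, name, 0 FROM category WHERE id = ?
    UNION ALL
    SELECT c.id, c.name, s.depth + 1
    FROM category AS c
    JOIN subtree AS s ON c.parent_id = s.id
)
SELECT id, name, depth FROM subtree ORDER BY depth, id
"""


def fetch_subtree(conn, root_id):
    """Return (id, name, depth) for the root and all of its descendants."""
    return conn.execute(SUBTREE_SQL, (root_id,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in fetch_subtree(conn, 3):
        print(row)
```

The query text stays the same no matter how deep the tree is, which is the main difference from a chain of hand-written self joins.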

So I was searching for the best way to store this kind of data. In one place I found that nested sets are a better approach than adjacency lists. Then I was advised to look into NoSQL, in case that would solve my problem. So now I am confused about whether to stay with SQL, move to NoSQL, or whether there is some other, better way to handle this kind of data.
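For reference, a minimal sketch of the nested sets idea: each node stores a left and a right bound assigned by a depth-first walk, and a whole subtree is then a single range comparison with no recursion. The table name `node`, the column names `lft` and `rgt`, and the sample tree are assumptions for illustration; sqlite3 is used only so the example is runnable.

```python
import sqlite3


def number_nested_sets(children, root):
    """Assign (lft, rgt) bounds by a depth-first walk of a parent -> children map.

    Recursive for brevity; a very deep tree would need an explicit stack.
    """
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        lft = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        bounds[node] = (lft, counter)
        counter += 1

    visit(root)
    return bounds


def build_demo_db(bounds):
    """Load the numbered nodes into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " name TEXT PRIMARY KEY,"
        " lft INTEGER NOT NULL,"
        " rgt INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO node (name, lft, rgt) VALUES (?, ?, ?)",
        [(name, lft, rgt) for name, (lft, rgt) in bounds.items()],
    )
    return conn


def fetch_subtree(conn, name):
    """Return the node and all descendants using one range comparison."""
    sql = """
    SELECT child.name
    FROM node AS parent
    JOIN node AS child ON child.lft BETWEEN parent.lft AND parent.rgt
    WHERE parent.name = ?
    ORDER BY child.lft
    """
    return [row[0] for row in conn.execute(sql, (name,))]


if __name__ == "__main__":
    tree = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E", "F"]}
    conn = build_demo_db(number_nested_sets(tree, "A"))
    print(fetch_subtree(conn, "C"))
```

The trade-off is on the write side: inserting or moving a node means renumbering the bounds of other rows, so this layout tends to suit trees that are read far more often than they change.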

Can anyone suggest the best approach?

If MySQL is giving you more trouble than it solves, I'd take a look at MongoDB, CouchDB or Elasticsearch (depending on your use case). Maybe even Neo4j. Your choice should come down to several points such as replication, scaling capacity, consistency... I advise you to read the official documentation carefully before you decide. Here's a starting point for comparison.

Going NoSQL will get rid of all the joins and can improve your performance, but you'll still need to implement a proper hierarchy using an adjacency list, nested sets, materialized paths and so on.
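As a rough illustration of that point, here is a sketch of the materialized path approach applied to document-style records. Plain Python dicts stand in for documents in a store such as MongoDB or CouchDB; the field names `_id`, `parent` and `path` and the sample tree are assumptions, and no real database API is used.

```python
def build_documents(parent_of):
    """Turn a child -> parent map into documents carrying a materialized path."""

    def path_for(node):
        parent = parent_of.get(node)
        if parent is None:
            return [node]
        return path_for(parent) + [node]

    docs = {}
    for node, parent in parent_of.items():
        docs[node] = {
            "_id": node,
            "parent": parent,
            # Full path from the root, e.g. "A.C.D".
            "path": ".".join(path_for(node)),
        }
    return docs


def descendants(docs, ancestor_id):
    """Return ids of all documents strictly below ancestor_id.

    The trailing dot keeps "A.B" from also matching a sibling like "A.BX".
    """
    prefix = docs[ancestor_id]["path"] + "."
    return sorted(d["_id"] for d in docs.values() if d["path"].startswith(prefix))


def ancestors(docs, node_id):
    """Return ancestor ids from the root down, read straight from the path."""
    return docs[node_id]["path"].split(".")[:-1]


if __name__ == "__main__":
    parent_of = {
        "A": None, "B": "A", "C": "A",
        "D": "C", "E": "C", "F": "C", "G": "B",
    }
    docs = build_documents(parent_of)
    print(descendants(docs, "C"))
    print(ancestors(docs, "D"))
```

In a real document store the linear scan in `descendants` would normally become a prefix query on an indexed `path` field; the hierarchy logic itself still has to be designed and maintained by the application.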

Keep in mind that most of the NoSQL technologies above lean on eventual consistency, at least in their distributed configurations, which essentially means that your data might not be consistent across all nodes at a given moment. If this is a problem, you should stick with an RDBMS.

Postgres has native support for this, using ltree:

-- Ltree type presentation
-- Farshid Ashorui

-- First of all, this is an extension (included with standard installation)
CREATE EXTENSION IF NOT EXISTS ltree;

-- We need to specify `ltree` type.
CREATE TABLE IF NOT EXISTS tree(
    id serial primary key,
    letter char,
    path ltree
);


-- A GiST index on the path lets Postgres answer ancestor/descendant
-- queries (such as <@) without scanning the whole table.
-- Read more here: http://patshaughnessy.net/2017/12/15/looking-inside-postgres-at-a-gist-index
CREATE INDEX IF NOT EXISTS tree_path_idx ON tree USING GIST (path);


-- @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

-- Root of hierarchy
insert into tree (letter, path) values ('A', 'A');
insert into tree (letter, path) values ('B', 'A.B');
-- Notice that here we branch off onto a second path (A.C)
insert into tree (letter, path) values ('C', 'A.C');
insert into tree (letter, path) values ('D', 'A.C.D');
insert into tree (letter, path) values ('E', 'A.C.E');
insert into tree (letter, path) values ('F', 'A.C.F');
-- Back to B path
insert into tree (letter, path) values ('G', 'A.B.G');

-- @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
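-- Find A.C and everything beneath it (<@ means "is a descendant of, or equal to")
select * from tree where path <@ 'A.C';
-- Alternative: a plain string-prefix match on the path text. It cannot use the
-- GiST index and would also match a sibling label such as A.B.GX;
-- `path <@ 'A.B.G'` is the index-friendly form.
select * from tree where strpos(path::varchar, 'A.B.G') = 1;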
