
neo4j (cypher) is very slow

I am trying to build a graph of the different entities people like on Facebook, to create a basic cross-domain recommendation engine.

I have data for different entities (movies, books, music, etc.). A node is created for each item, with the item's name (name of the movie, book, etc.) and its entity type (movie, book, etc.) as properties. Any two nodes have a relationship between them called "affinity". This relationship also has a "strength" property, which is equal to the number of people who have liked both items.

I use FB users to connect these nodes. FB users are also nodes in the graph, with the person's name as a property and the type set to person. The relationship between these nodes and item nodes is called 'likes'. Now, if a person has liked a movie, I would like to recommend books or music to them by traversing the graph. This is the Cypher query I am using to traverse the graph:

START root = node(<LIKED_MOVIE_NODE_ID>)
MATCH p = root-[rel1:affinity*..3]-(movies)<-[rel2:likes]-(persons)-[rel3:likes]->(books)
WHERE HAS(movies.type) and movies.type = "movies" and HAS(persons.type) and persons.type = "person" and HAS(books.type) and books.type = "books"
RETURN books

This runs very slowly, sometimes taking up to 500 seconds. I have some 13,000 movie, 2,000 book and 3,000 music nodes. Connecting them are 16,000 people. Altogether there are some 300,000 relationships.

My questions are:

  1. Am I doing something wrong? Is there a better way to do this? I am new to neo4j. I have tried some of the techniques for tuning the neo4j graph DB: I have increased the min heap size to 4 GB and am running it on an 8-core machine with 32 GB RAM.

  2. I want to know the strength of the rel1 relationships and the number of rel2 and rel3 relationships. rel1 has a strength property, but I am not able to retrieve it.

Please advise, as I am on the verge of giving up on neo4j and going back to SQL. At least it works. :(

Regards, Paritosh

Cypher is slow. Actually, very slow compared to the traversal and core APIs (http://java.dzone.com/articles/get-full-neo4j-power-using).

That said, you could try to limit the number of nodes neo4j processes by splitting up your MATCH into several parts chained with WITH clauses. Depending on your use case you could, for example, put the root-[rel1:affinity*..3]-(movies) part in a separate clause and keep only the distinct movies. Otherwise neo4j will process every combination of paths that leads to a movie.
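A rough sketch of that split, assuming the pre-2.0 Cypher syntax used in the question. The count(*) aggregation is only one way to collapse duplicate movies (the pathCount alias is made up for illustration), and I have not run this against your data:

```cypher
START root = node(<LIKED_MOVIE_NODE_ID>)
MATCH root-[:affinity*..3]-(movies)
WHERE movies.type! = "movies"
// Aggregating here leaves one row per movie, however many paths reach it
WITH movies, count(*) AS pathCount
MATCH (movies)<-[:likes]-(persons)-[:likes]->(books)
WHERE persons.type! = "person" AND books.type! = "books"
RETURN DISTINCT books
```

The second MATCH then starts from each movie once, instead of once per affinity path that reaches it.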

PS:

WHERE HAS(movies.type) and movies.type = "movies" and HAS(persons.type) and persons.type = "person" and HAS(books.type) and books.type = "books"

can be rewritten as

WHERE movies.type! = "movies" and persons.type! = "person" and books.type! = "books"

Or, if you are using neo4j 2.0.0M4, you can just skip the HAS().
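On 2.0, where as far as I know a comparison against a missing property simply fails to match rather than raising an error, the clause would presumably reduce to:

```cypher
WHERE movies.type = "movies" AND persons.type = "person" AND books.type = "books"
```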

