Create many nodes in Neo4j using Py2neo

For some time I have been trying to create a lot of nodes with py2neo. The nodes are based on a live stream of tweets, where I want to record each tweet, information about who tweeted it, and the relations between tweets and other nodes (e.g. retweets, user mentions and tags).

What I have tried is building one large Cypher query that uses MERGE to create or look up the user and the tweet and then links them. My idea is to have the following nodes:

  • User node
  • Tweet
  • Tags
  • Location (for the tweet)
  • Location (for the user)
  • Language
  • Gender
  • TimeZone

And links where I need a lookup:

  • Mentions
  • Retweet

This is a lot of writing, and I do something like this:

# Note: parameters use the legacy {name} syntax; Neo4j 4.0+ expects $name instead.
# Every fragment ends with a space so adjacent clauses do not run together.
statement = "MERGE (tUser:TwitterUser {id:{tuID}}) " \
            "ON CREATE SET " \
            "tUser.displayName = {tdNAME}, " \
            "tUser.summary = {tdSummary}, " \
            "tUser.link = {tdLink}, " \
            "tUser.preferredUsername = {tdPreferredUsername}, " \
            "tUser.account_created = {tdAccount_created}, " \
            "tUser.last_lookup = 'Newer' " \
            "" \
            "MERGE (user:Person {name:{userName}})-[:twitter_acct]->(tUser) " \
            "" \
            "MERGE (gender:Gender {gender: {GENDER}}) " \
            "MERGE (user)-[:has_gender]->(gender) " \
            "" \
            "MERGE (user)-[:tweeted]->(tweet:Tweet {id:{tID}}) " \
            "ON CREATE SET " \
            "tweet.type = {tType}, " \
            "tweet.link = {tLink}, " \
            "tweet.body = {tBody}, " \
            "tweet.postedTime = {tPostedTime} " \
            "" \
            "MERGE (timezone:TimeZone {name:{timeZoneName}}) " \
            "MERGE (user)-[:has_time]->(timezone) " \
            "" \
            "MERGE (user)-[:use]->(generator:Generator {name: {generator}}) " \
            "ON CREATE SET " \
            "generator.link = {generatorLink} " \
            "" \
            "MERGE (tweet)-[:tweeted_in]->(tLocation:Location {name: {tLocationName}}) " \
            "MERGE (tLocation)-[:in]->(tCountry:Country {name: {tCountryName}}) " \
            "" \
            "MERGE (user)-[:lives_in]->(uLocation:Location {name: {uLocationName}}) " \
            "" \
            "RETURN user"

The problem is: when I try to insert the tweets into my Neo4j database, it can't keep up with the stream. Even when I run it against a data set I prepared myself, it is still slow. I have tried using batches, but it is still too slow.

Would the solution be to create fewer nodes, get a better machine, or use a schema? (And how do I get the right ID of, e.g., a user node if I restart the service?)

It's hard to know which direction to point you in without some quantification of "a lot" and "slow". These are very subjective terms.

Generally speaking, you'll want to make sure that you combine a number of server interactions into a single request, either via a large Cypher transaction or via the (legacy) batch mechanism. It is certainly possible to get quite a lot of performance out of the REST interface, if used cleverly.
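As a rough illustration of combining many writes into one request, here is a minimal sketch that groups tweet rows into chunks and sends each chunk with a single parameterised statement. Everything in it is an assumption made for illustration: the helper names `chunked` and `send_in_batches`, the simplified `UNWIND` statement (which uses the newer `$rows` parameter syntax and only covers the user and tweet nodes), and the `run` callable, which stands in for whatever executes Cypher in your driver version.

```python
from itertools import islice

# Hypothetical, simplified statement: one request creates many users and tweets.
# UNWIND turns the list parameter into one row per tweet.
BATCH_STATEMENT = (
    "UNWIND $rows AS row "
    "MERGE (tUser:TwitterUser {id: row.tuID}) "
    "MERGE (tweet:Tweet {id: row.tID}) "
    "MERGE (tUser)-[:tweeted]->(tweet)"
)


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def send_in_batches(run, rows, batch_size=1000):
    """Call run(statement, parameters) once per chunk; return the request count.

    `run` is an assumed stand-in for the function that executes Cypher
    against the server; it is injected so the batching logic stays
    independent of the driver.
    """
    requests = 0
    for batch in chunked(rows, batch_size):
        run(BATCH_STATEMENT, {"rows": batch})
        requests += 1
    return requests
```

With 2,500 buffered tweets and the default batch size, this issues three requests instead of 2,500. The batch size is a tuning knob rather than a recommendation; the right value depends on the payload size and the server.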

Outside of that, you can of course look at something server side: perhaps a Java extension or, for initial loads, one of the bulk import tools.
