
Best approach for graph nodes relationship in Neo4J (Cypher)

I'm trying to achieve the following data structure in my Neo4j graph database:

(User)-[:isEmployedBy]->(Company), where each "employment" can have multiple "transactions".

I'm considering the following options and would like to hear which would be the most future-proof:

Simplest:

create
  (matt:Person { name: 'Matt' } ),
  (stackoverflow:Company { name: 'Stackoverflow' }),
  (matt)-[:employed_by { from: date("2000-01-01"), until: date("2010-01-01") }]->(stackoverflow)
  return *

However, this way I can't add further relationships (such as transactions) to the employment itself. I assume that makes my second approach my only option. Is that correct? (See below.)

create
  (matt:Person { name: 'Matt' } ),
  (stackoverflow:Company { name: 'Stackoverflow' }),
  (employment:Employment { from: date("2000-01-01"), until: date("2010-01-01")}),
  (t1:Payment { amount: 100 }),
  (t2:Payment { amount: 50 }),
  (employment)-[:received]->(t1),
  (employment)-[:received]->(t2)
  return *

I understand I could attach those transactions directly to the person, but I need them associated directly with the employment: if the person loses the job (the connection), all of its transactions need to disappear.

3) I could also do both connections:

create
  (matt:Person { name: 'Matt' } ),
  (stackoverflow:Company { name: 'Stackoverflow' }),
  (employment:Employment { from: date("2000-01-01"), until: date("2010-01-01")}),
  (matt)-[:employed_by { from: date("2000-01-01"), until: date("2010-01-01") }]->(stackoverflow),
  (matt)-[:has_employment]->(employment)<-[:has_employment]-(stackoverflow)
  return *

While I might run into inconsistent data (the dates here), would this approach give me a query performance benefit if, say, I only wanted to see who was employed by whom, without further details or transactions (using employed_by)?

General question: Do I want to (or need to) set up bi-directional connections?

create
  (matt:Person { name: 'Matt' } ),
  (stackoverflow:Company { name: 'Stackoverflow' }),
  (matt)-[:employed_by { from: date("2000-01-01"), until: date("2010-01-01") }]->(stackoverflow),
  (matt)<-[:employs { from: date("2000-01-01"), until: date("2010-01-01") }]-(stackoverflow)
  return *

Again, I'd end up with duplicate information - is there a benefit to this at all?

Thanks for any hints!

Option #3 is close to what I'd recommend, except that:

  1. The :employed_by relationship is redundant and should be omitted: it wastes storage, overcomplicates some queries, and carries the risk of inconsistencies. The Employment node already contains the same information.
  2. I would avoid using the same relationship type on both sides of the Employment node, to avoid confusion and to potentially make future queries more efficient.
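As a rough sketch of what those two changes could look like, here is option #3 without the redundant relationship and with a different relationship type on each side of the Employment node. The type names AT_COMPANY and RECEIVED are only illustrative assumptions (they are not from the question), and the payments are attached as in the second option:

```cypher
// Sketch only: AT_COMPANY and RECEIVED are hypothetical relationship types.
CREATE
  (matt:Person { name: 'Matt' }),
  (stackoverflow:Company { name: 'Stackoverflow' }),
  // The employment is a node, so it can carry its own relationships.
  (employment:Employment { from: date("2000-01-01"), until: date("2010-01-01") }),
  (t1:Payment { amount: 100 }),
  (t2:Payment { amount: 50 }),
  // Distinct types on each side of the Employment node.
  (matt)-[:HAS_EMPLOYMENT]->(employment)-[:AT_COMPANY]->(stackoverflow),
  // Payments hang off the employment, not the person.
  (employment)-[:RECEIVED]->(t1),
  (employment)-[:RECEIVED]->(t2)
RETURN *
```

The dates live only on the Employment node here, so there is no second copy that could drift out of sync.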

Also, in the standard Neo4j naming convention, relationship types are written in all uppercase (e.g., "HAS_EMPLOYMENT"). Following it makes your Cypher code easier to read.
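To illustrate how queries might look against an Employment-node model with uppercase relationship types, here are two separate statements. They assume a (Person)-[:HAS_EMPLOYMENT]->(Employment)-[:AT_COMPANY]->(Company) shape with payments attached via RECEIVED; AT_COMPANY and RECEIVED are hypothetical names chosen for this sketch, not something from the question:

```cypher
// Statement 1: who was employed by whom, without touching any payments.
MATCH (p:Person)-[:HAS_EMPLOYMENT]->(e:Employment)-[:AT_COMPANY]->(c:Company)
RETURN p.name AS person, c.name AS company, e.from AS startDate, e.until AS endDate;

// Statement 2: remove one employment together with all of its payments.
MATCH (:Person { name: 'Matt' })-[:HAS_EMPLOYMENT]->(e:Employment)
      -[:AT_COMPANY]->(:Company { name: 'Stackoverflow' })
// OPTIONAL MATCH so the employment is still deleted when it has no payments.
OPTIONAL MATCH (e)-[:RECEIVED]->(t:Payment)
DETACH DELETE e, t;
```

The first query only needs one extra hop compared with a direct employed_by relationship. The second covers the requirement that the transactions disappear when the job does.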

