
Labels, properties, or nodes? Cypher/Neo4j

I can't quite tell if this is a bad question, but I think it has a definitive answer...

I'm working on building my first graph database. It will hold nodes that are references to content. These nodes will be connected to term nodes. Each term node can be one of about seven types (Person, Organization, Jargon etc).

What is the best way to implement the types of terms in the database as it relates to query speed? Users will search for content based on the terms and the idea is to allow them to filter the terms based on their types.

Storing the types as a property seems out of the question, as it would require accessing a JSON object for every term during a query.

(contentNode:content)-[:TAGGED_WITH]-(termNode:term {type: {"people":false,"organizations":false,"physicalObjects":true,"concepts":true,...}})

Labels intuitively make sense to me as the different types really are just labeling the term nodes more specifically. Each term node could have the label 'term' as well as the relevant types. I have some confusion about this, but it seems labels cannot be used as dynamic properties in a cypher query as it prevents the query from being cached/properly indexed.

(contentNode:content)-[:TAGGED_WITH]-(termNode:term:physicalObject:jargon:...)

The last option I can think of would be to have a node for each of the term 'types' and connect the term to the relevant type nodes. Right now this seems like the best option (despite being the most verbose).

(contentNode:content)-[:TAGGED_WITH]-(termNode:term)-[:IS_TYPE]-(typeNode:termType {name: 'jargon'}), (termNode:term)-[:IS_TYPE]-(typeNode:termType {name: 'physical object'}), (termNode:term)-[:IS_TYPE]-(typeNode:termType {name: ...})

Can anyone with more experience/knowledge weigh in on this? Thanks a lot.

I'm not sure I completely understand what you're trying to do but I wanted to answer a few of the points and then maybe you can elaborate:

"but it seems labels cannot be used as dynamic properties in a cypher query as it prevents the query from being cached/properly indexed."

Using dynamic labels won't have an impact on indexing, but you're partially right about the caching. The Cypher parser keeps a cache of queries that it has seen before so that it doesn't have to regenerate the query plan each time. Given that you only have a limited number of labels, it wouldn't take long until you've cached all combinations anyway.
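To make the caching point concrete, here is a minimal Python sketch, assuming the type labels are spliced into the query text because the Cypher version in use does not accept labels as query parameters. The names in `ALLOWED_TYPES` and the helpers `build_label_query` and `all_label_queries` are hypothetical; only `content`, `term`, and `TAGGED_WITH` come from the question. Every distinct label combination yields a distinct query string, so each one needs its own cached plan, but with seven types there are at most 2^7 - 1 = 127 non-empty combinations.

```python
from itertools import combinations

# Hypothetical whitelist of the "about seven" type labels; only some of these
# names appear in the question, the rest are placeholders.
ALLOWED_TYPES = ("Person", "Organization", "Jargon", "PhysicalObject",
                 "Concept", "Place", "Event")


def build_label_query(types):
    """Return Cypher text matching content tagged with a term that carries ALL given type labels."""
    unknown = set(types) - set(ALLOWED_TYPES)
    if unknown:
        # Labels are spliced into the query text, so reject anything not whitelisted.
        raise ValueError(f"unknown type labels: {sorted(unknown)}")
    # Sort so the same set of types always produces the same query string.
    labels = "".join(f":{t}" for t in sorted(set(types)))
    return (f"MATCH (c:content)-[:TAGGED_WITH]-(t:term{labels}) "
            "RETURN DISTINCT c")


def all_label_queries():
    """Enumerate every distinct query text this builder can produce."""
    return {
        build_label_query(combo)
        for size in range(1, len(ALLOWED_TYPES) + 1)
        for combo in combinations(ALLOWED_TYPES, size)
    }
```

Sorting the labels before building the string keeps `["Jargon", "Person"]` and `["Person", "Jargon"]` from producing two different query texts for the same filter.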

I would suggest trying out the various models with a subset of your data and measuring the query time and query readability for each.
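One rough way to set up that comparison is to write the same lookup ("content tagged with a jargon term") for each model and time it. The query texts below follow the patterns from the question, but the list property `types`, the `$typeName` parameter, and the `median_seconds` helper are assumptions for illustration, and `run_query` stands in for whatever driver call you actually use.

```python
import statistics
import time

# The same lookup under each candidate model.
CANDIDATE_QUERIES = {
    # Types stored as a list property on the term (assumed property name: types).
    "property": "MATCH (c:content)-[:TAGGED_WITH]-(t:term) "
                "WHERE $typeName IN t.types RETURN DISTINCT c",
    # Types stored as extra labels on the term node.
    "label": "MATCH (c:content)-[:TAGGED_WITH]-(t:term:jargon) "
             "RETURN DISTINCT c",
    # Types stored as separate termType nodes.
    "type_node": "MATCH (c:content)-[:TAGGED_WITH]-(t:term)"
                 "-[:IS_TYPE]-(:termType {name: $typeName}) RETURN DISTINCT c",
}


def median_seconds(run_query, query, params, repeats=5):
    """Call run_query(query, params) `repeats` times and return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(query, params)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

For the numbers to mean anything, `run_query` should fully consume the result, and it is probably worth discarding the first run of each query so that plan compilation and cold caches do not skew the comparison.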

Mark

 