Graph exploration: does the choice to use incoming edges or outgoing edges affect performance?

I have been tinkering with graphs for some time, with the aim of implementing appropriate portions of the server-side stack with them. I have used Scala-Graph and Neo4J, and I am learning Spark GraphX. In almost all the applications I have implemented, the model has been a Property Graph (Node -> Edge -> Node, with attributes).

When designing the graph (a DAG, to be precise), if I spot a strong, directed relationship between two nodes, I set up an edge from one node to the other. This is obvious and intuitive. If a Person likes a Site, an edge with the property 'likes' connects them. Thus:


[Nirmalya] -- (Likes) --> [StackOverFlow]

[John] -- (Likes) --> [StackOverFlow]

[Ted] -- (Likes) --> [GoogleGroups]

[Nirmalya] -- (Likes) --> [Neo4J]


Now, using outgoing edges, I can easily find out which sites Nirmalya likes.

But when I want to find out who else likes what Nirmalya likes (i.e., John), I tend to think that I should also create an edge from the Site-type node to the Person-type node (with the property 'isLikedBy'), so that the path is obvious and the traversal is intuitive. Every Person and Site would then be connected in both directions, so that I can reach either one from the other to answer queries like this one.


[Nirmalya] -- (Likes) --> [StackOverFlow] -- (IsLikedBy) --> [John]


But from many examples given by experts, I see that this is not what is prescribed. Instead, the same result is achieved with operators such as incoming. In other words, if two nodes have an edge between them, I don't need to set up both directions of the edge explicitly (just 'likes' is sufficient; 'isLikedBy' is superfluous). Perhaps an adjacency-matrix implementation makes this possible, but I get a bit confused because I am allowed to derive the reverse direction even though that direction is not explicit in the DAG.

My question is: where is the gap in my understanding? Is it that the 'IsLikedBy' direction should ideally be present, but we are optimizing it away? Alternatively, are there use cases where such bidirectional edges are necessary, which I need to learn to spot? Am I completely missing a theoretical underpinning?
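To make the "incoming" idea concrete, here is a minimal sketch in Python. It is not taken from Scala-Graph, Neo4J, or GraphX; the names LIKES, build_indexes and co_likers are made up for illustration. It assumes only that each 'Likes' edge is stored once as a (source, target) pair. The reverse direction is not a second edge; it is just a second index over the same pairs.

```python
from collections import defaultdict

# Each 'Likes' relationship is stored exactly once, as (source, target).
# These are the four edges from the question; no 'IsLikedBy' edge exists.
LIKES = [
    ("Nirmalya", "StackOverFlow"),
    ("John", "StackOverFlow"),
    ("Ted", "GoogleGroups"),
    ("Nirmalya", "Neo4J"),
]


def build_indexes(edges):
    """Index the same edge list twice: by source and by target."""
    outgoing = defaultdict(set)  # person -> sites that person likes
    incoming = defaultdict(set)  # site -> persons who like that site
    for source, target in edges:
        outgoing[source].add(target)
        incoming[target].add(source)
    return outgoing, incoming


def co_likers(person, edges=LIKES):
    """People who like at least one site that `person` likes."""
    outgoing, incoming = build_indexes(edges)
    found = set()
    for site in outgoing[person]:   # follow 'Likes' forwards
        found |= incoming[site]     # follow the same edges backwards
    found.discard(person)           # exclude the person themselves
    return found


print(co_likers("Nirmalya"))
```

With the edges above, the only other person sharing a liked site with Nirmalya is John, reached through StackOverFlow without any explicit 'IsLikedBy' edge.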

I will be glad to become wiser.

I think it depends on the software. I can speak for Neo4j, but not for the other tools that you mentioned ;)

In Neo4j, relationships are designed to be traversable both forwards and backwards without a performance cost. This applies to traversal through the Java APIs as well as through Cypher. You can specify a direction (incoming or outgoing) or query for relationships regardless of direction, and the performance characteristics should be the same.
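As a rough mental model only (this is a toy in Python, not Neo4j's storage engine or its API), one can picture each relationship as a single record that is reachable from both of its endpoint nodes. The names ToyGraph, Relationship, Direction, relate and neighbours are hypothetical and chosen purely for illustration. Under that assumption, choosing a direction is only a filter over the same records, which is why no second 'IsLikedBy' record is needed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Direction(Enum):
    OUTGOING = "outgoing"
    INCOMING = "incoming"
    BOTH = "both"


@dataclass(frozen=True)
class Relationship:
    start: str
    kind: str
    end: str


@dataclass
class ToyGraph:
    # node name -> every relationship record touching that node
    by_node: dict = field(default_factory=dict)

    def relate(self, start, kind, end):
        """Create ONE record and register it under both endpoints."""
        rel = Relationship(start, kind, end)
        self.by_node.setdefault(start, []).append(rel)
        self.by_node.setdefault(end, []).append(rel)
        return rel

    def neighbours(self, node, direction=Direction.BOTH):
        """Walk the node's records, keeping those matching `direction`."""
        result = []
        for rel in self.by_node.get(node, []):
            if rel.start == node and direction is not Direction.INCOMING:
                result.append(rel.end)      # record leaves `node`
            elif rel.end == node and direction is not Direction.OUTGOING:
                result.append(rel.start)    # record arrives at `node`
        return result


g = ToyGraph()
g.relate("Nirmalya", "Likes", "StackOverFlow")
g.relate("John", "Likes", "StackOverFlow")

likers = g.neighbours("StackOverFlow", Direction.INCOMING)
liked_by_site = g.neighbours("StackOverFlow", Direction.OUTGOING)
```

Here likers holds Nirmalya and John, while liked_by_site is empty, because StackOverFlow has no outgoing relationships. In this toy, looking up a node's records costs the same whichever direction is requested.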
