Gremlin query to combine vertices with unrelated vertices in CosmosDB

I would like to get several vertices, e.g. with the label "user", combined with the vertices they are not yet related to, e.g. with the label "movie".

I know that the strength of Gremlin is traversing edges between vertices, and that combining objects that are not related is not the best use case for a graph. I am using Azure CosmosDB for my application, so if there is a more performant way to do this, feel free to let me know. If it can be done with Gremlin, I need some help with the query. Here is an example:

There are 4 users: bob, jose, frank, peter and 4 movies: movie1, movie2, movie3, movie4

Between the users and movies there can be an edge "watched"

My example data looks as follows:

watched:
[bob, [movie1,movie2]]
[jose, [movie3]]
[frank, []]
[peter, [movie4]]

The result and format I would like to get is the following:

not watched:
[bob, movie3]
[bob, movie4]
[jose, movie1]
[jose, movie2]
[jose, movie4]
[frank, movie1]
[frank, movie2]
[frank, movie3]
[frank, movie4]
[peter, movie1]
[peter, movie2]
[peter, movie3]
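
Put differently, the desired output is the cross product of users and movies minus the pairs that have a "watched" edge (an anti-join). As a minimal sketch of that definition, here is the same computation in plain Python over the sample data. This is not a Gremlin query, and the function name not_watched is made up for illustration:

```python
from itertools import product

# Sample data from the question.
users = ["bob", "jose", "frank", "peter"]
movies = ["movie1", "movie2", "movie3", "movie4"]
watched = {
    ("bob", "movie1"), ("bob", "movie2"),
    ("jose", "movie3"), ("peter", "movie4"),
}

def not_watched(users, movies, watched):
    """Return every (user, movie) pair that has no "watched" edge."""
    return [(u, m) for u, m in product(users, movies) if (u, m) not in watched]

# Print the pairs in the same [user, movie] format as the list above.
for user, movie in not_watched(users, movies, watched):
    print(f"[{user}, {movie}]")
```

Something like this could also serve as a reference to check a Gremlin query's output against.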

The script to set up the graph (using /partition_key as partition key):

g.addV("user").property("partition_key", 1).property("id", "bob")
g.addV("user").property("partition_key", 1).property("id", "jose")
g.addV("user").property("partition_key", 1).property("id", "frank")
g.addV("user").property("partition_key", 1).property("id", "peter")

g.addV("movie").property("partition_key", 1).property("id", "movie1")
g.addV("movie").property("partition_key", 1).property("id", "movie2")
g.addV("movie").property("partition_key", 1).property("id", "movie3")
g.addV("movie").property("partition_key", 1).property("id", "movie4")

g.V("bob").addE("watched").to(g.V("movie1"))
g.V("bob").addE("watched").to(g.V("movie2"))
g.V("jose").addE("watched").to(g.V("movie3"))
g.V("peter").addE("watched").to(g.V("movie4"))

Please note that I cannot use lambdas, because Azure CosmosDB doesn't support them.

A join in Gremlin can be realized by repeating the V() step mid-traversal. Once you see that, the Gremlin query reads almost like an ordinary SQL query, see below.

g.V().has("id", "bob").addE("watched").to(__.V().has("id", "movie1"))
g.V().has("id", "bob").addE("watched").to(__.V().has("id", "movie2"))
g.V().has("id", "jose").addE("watched").to(__.V().has("id", "movie3"))
g.V().has("id", "peter").addE("watched").to(__.V().has("id", "movie4"))

g.V().hasLabel("user").as("u").
  V().hasLabel("movie").as("m").
  in("watched").where(neq("u")).
  select("u", "m").by("id").
  order().by("u").by("m")

==>[u:bob,m:movie3]
==>[u:bob,m:movie4]
==>[u:frank,m:movie1]
==>[u:frank,m:movie2]
==>[u:frank,m:movie3]
==>[u:frank,m:movie4]
==>[u:jose,m:movie1]
==>[u:jose,m:movie2]
==>[u:jose,m:movie4]
==>[u:peter,m:movie1]
==>[u:peter,m:movie2]
==>[u:peter,m:movie3]
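
The query above reproduces the expected result for the sample data, where every movie has exactly one watcher. A rough plain-Python model of the in("watched").where(neq("u")) step suggests the output would no longer be a true "not watched" list once a movie has several watchers or none. This is a sketch, not Gremlin; the function names and the extra edge are hypothetical:

```python
from itertools import product

# Sample data from the question.
users = ["bob", "jose", "frank", "peter"]
movies = ["movie1", "movie2", "movie3", "movie4"]
watched = [("bob", "movie1"), ("bob", "movie2"),
           ("jose", "movie3"), ("peter", "movie4")]

def neq_join(users, movies, watched):
    """Model of the query: one row per watcher of m who is not u."""
    rows = []
    for u, m in product(users, movies):
        for watcher, movie in watched:
            if movie == m and watcher != u:
                rows.append((u, m))
    return sorted(rows)

def anti_join(users, movies, watched):
    """Reference: pairs with no "watched" edge between them."""
    seen = set(watched)
    return sorted(p for p in product(users, movies) if p not in seen)

# On the sample data both models agree.
print(neq_join(users, movies, watched) == anti_join(users, movies, watched))

# Hypothetical extra edge: jose also watched movie1.
extra = watched + [("jose", "movie1")]
rows = neq_join(users, movies, extra)
print(("bob", "movie1") in rows)        # bob watched movie1, yet the pair shows up
print(rows.count(("frank", "movie1")))  # frank's pair is emitted once per watcher
```

If that case matters, one pattern to consider is replacing the in/neq pair with a filter of the form not(__.in("watched").where(eq("u"))), which keeps a (user, movie) pair only when no "watched" edge connects the two. That variant is untested here against CosmosDB.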

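You are right in saying that this query does not perform well in Gremlin: it builds every user/movie combination before filtering, so the work grows with the number of users times the number of movies. I would advise you to use the SQL API of CosmosDB instead.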

