
How to write a Gremlin query that finds common vertices between other vertices and returns them sorted by the count of overlap?

I am using python+gremlin to implement my graph queries, but I am still far from understanding a lot of the concepts, and I have run into an interesting query that I don't know how to write.

Let's say we have a number of chef vertices with the label Chef, ingredient vertices with the label Ingredient, and dish vertices with the label Dish. At any given time, a chef can have ingredients at hand to use, indicated by a has edge between Chef and Ingredient. Dishes have ingredients, indicated by a uses edge between Dish and Ingredient. There is also a madeBefore edge between Chef and Dish, indicating that the chef has made the dish before.

It is probably obvious, but there are dishes that a chef has never made, not every dish uses every ingredient, and a chef most likely doesn't have every ingredient.

I would like to create a query that does the following:

Get the dishes that the chef has never made, sorted so that the dishes for which the chef has the most of the required ingredients come first (getting the ratio as well would be great). So the first dishes in the results are ones the chef has never made but may have all the ingredients for, somewhere in the middle are dishes they have never made and have around half the ingredients for, and last are dishes they have never made and have almost none of the ingredients for.

The following query finds all dishes that the chef has never made:

g.V()\
    .hasLabel("Dish")\
    .filter(
        __.not_(
            __.in_("madeBefore").has("Chef", "name", "chefbot1")
        ))\
    .valueMap(True)\
    .toList()

But from here I have no idea how to start sorting those dishes by how many of their ingredients the chef has.

My other idea was to query over the ingredients instead, use project to get the counts of edges connecting to both the chef and the dishes, and then filter on those somehow, but I don't know what to do after this:

g.V()\
    .hasLabel("Ingredient")\
    .filter(
        __.in_("has").has("Chef", "name", "chefbot1"))\
    .project("v", "dishesUsingIngredient")\
    .by(__.valueMap(True))\
    .by(__.inE().hasLabel("uses").count())\
    .order().by("dishesUsingIngredient", Order.desc)\
    .toList()

My problem with Gremlin right now is understanding how to chain together more complicated queries. Can anyone shed some light on how to solve this kind of problem?

If I understood your description correctly, you can do something like this:

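// Dishes chefbot1 has never made, mapped to how many of their
// ingredients chefbot1 has, largest overlap first.
g.V().hasLabel('Dish').
  filter(__.not(__.in('madeBefore').
      has('Chef', 'name', 'chefbot1'))).
  group().
    by('name').
    by(out('uses').in('has').
      has('Chef', 'name', 'chefbot1').count()).
  order(local).by(values, desc)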

Example: https://gremlify.com/8w
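
As a cross-check on what such a traversal is meant to compute, here is a minimal plain-Python sketch of the same logic, including the ratio the question asks about. It does not use Gremlin at all: the toy chef, dish, and ingredient data and the function name rank_unmade_dishes are made up for illustration, and the three dictionaries only stand in for the has, madeBefore, and uses edges.

```python
# Hypothetical toy data standing in for the graph edges (not from the question).
has = {"chefbot1": {"egg", "flour", "milk"}}    # Chef -has-> Ingredient
made_before = {"chefbot1": {"omelette"}}        # Chef -madeBefore-> Dish
uses = {                                        # Dish -uses-> Ingredient
    "omelette": {"egg", "milk"},
    "pancakes": {"egg", "flour", "milk"},
    "carbonara": {"egg", "pasta", "bacon", "cheese"},
    "salad": {"lettuce", "tomato"},
}


def rank_unmade_dishes(chef):
    """Return (dish, overlap, ratio) for dishes the chef has never made.

    overlap is the number of the dish's ingredients the chef has on hand;
    ratio is overlap divided by the number of ingredients the dish uses.
    Rows are sorted by overlap, then ratio (both descending), then name.
    """
    pantry = has.get(chef, set())
    made = made_before.get(chef, set())
    rows = []
    for dish, needed in uses.items():
        if dish in made:
            continue  # mirrors the not(in('madeBefore')...) filter
        overlap = len(needed & pantry)
        ratio = overlap / len(needed) if needed else 0.0
        rows.append((dish, overlap, ratio))
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))


ranking = rank_unmade_dishes("chefbot1")
for dish, overlap, ratio in ranking:
    print(f"{dish}: {overlap} ingredient(s) on hand, ratio {ratio:.2f}")
```

With this toy data, pancakes comes first (all three ingredients on hand), carbonara second (one of four), and salad last (none), while omelette is excluded because the chef has already made it. Comparing a small graph against a model like this is one way to confirm that a traversal orders its results the way you intend.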
