
Find paths from a graph and then count how many times a path occurs in Azure Cosmos DB using Gremlin

I am storing clickstream events in a graph database using the following structure.

A user performs multiple events, and each event has an edge to the previous event:

  • Vertices are 'user' and 'event'
  • Edges are 'performed' and 'previous'

Each event has a property named referer. For example, if a user views the page www.foobar.com/aaa, there will be a page view event and it will have referer:www.foobar.com/aaa

Now I want to find the possible paths from the homepage, together with a count for each path.

Using the Gremlin query below I am able to find the possible paths, but I am not able to group them to get a count for each path:

g.V().hasLabel('event').has('referer','https://www.foobar.com/').in('previous').in('previous').path().by('referer')

Output:

 [
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/aaa",
          "https://www.foobar.com/bbb"
        ]
      },
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/aaa",
          "https://www.foobar.com/bbb"
        ]
      },
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/ccc",
          "https://www.foobar.com/ddd"
        ]
      }
    ]

I want output like this:

[[
  "https://www.foobar.com/",
  "https://www.foobar.com/aaa",
  "https://www.foobar.com/bbb"
]:2,
[
  "https://www.foobar.com/",
  "https://www.foobar.com/ccc",
  "https://www.foobar.com/ddd"
]:1]

Since I am using the Azure Cosmos DB graph database, only these Gremlin operators are available: https://docs.microsoft.com/en-us/azure/cosmos-db/gremlin-support
Thanks

You can apply groupCount to a path using syntax such as this:

groupCount().by(path().by('referer'))
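
As a rough analogy for what this step computes (plain Python, not Gremlin), grouping by path amounts to counting identical sequences of referer values. The sketch below uses the three paths from the question's sample output; the variable names are only illustrative.

```python
from collections import Counter

# Paths copied from the question's sample output, one list per traverser.
paths = [
    ["https://www.foobar.com/", "https://www.foobar.com/aaa", "https://www.foobar.com/bbb"],
    ["https://www.foobar.com/", "https://www.foobar.com/aaa", "https://www.foobar.com/bbb"],
    ["https://www.foobar.com/", "https://www.foobar.com/ccc", "https://www.foobar.com/ddd"],
]

# Lists are not hashable, so each path becomes a tuple before counting.
path_counts = Counter(tuple(p) for p in paths)

for path, count in path_counts.items():
    print(list(path), count)
```

The same counting could also be done client-side on the result of the original path() query if server-side grouping were not available.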

So you could rewrite your query as:

g.V().hasLabel('event').
      has('referer','https://www.foobar.com/').
      in('previous').
      in('previous').
      groupCount().by(path().by('referer'))
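
To make the traversal easier to follow, here is a small in-memory model of it in Python. It is only a sketch of the query's logic and does not talk to Cosmos DB; the event ids (e1 to e9), the edge list, and the helper names incoming and count_paths are all made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the 'event' vertices: event id -> referer.
referer = {
    "e1": "https://www.foobar.com/",
    "e2": "https://www.foobar.com/aaa",
    "e3": "https://www.foobar.com/bbb",
    "e4": "https://www.foobar.com/",
    "e5": "https://www.foobar.com/aaa",
    "e6": "https://www.foobar.com/bbb",
    "e7": "https://www.foobar.com/",
    "e8": "https://www.foobar.com/ccc",
    "e9": "https://www.foobar.com/ddd",
}

# 'previous' edges as (source, target): each event points at the one before it.
previous = [("e2", "e1"), ("e3", "e2"), ("e5", "e4"),
            ("e6", "e5"), ("e8", "e7"), ("e9", "e8")]


def incoming(event_id):
    """Mimic in('previous'): events whose 'previous' edge points at event_id."""
    return [src for src, dst in previous if dst == event_id]


def count_paths(start_referer, hops=2):
    """Walk `hops` incoming 'previous' edges and count paths by referer."""
    # Start at every event with the given referer, like has('referer', ...).
    paths = [[e] for e, r in referer.items() if r == start_referer]
    for _ in range(hops):
        paths = [p + [nxt] for p in paths for nxt in incoming(p[-1])]
    # Key each path by its referer values, like groupCount().by(path().by('referer')).
    return Counter(tuple(referer[e] for e in p) for p in paths)


counts = count_paths("https://www.foobar.com/")
for path, n in counts.items():
    print(list(path), n)
```

In this model, paths that end early (an event with no follower) drop out after the corresponding hop, which matches how the chained in('previous') steps only keep traversers that can continue.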

Hope this helps,

Cheers, Kelvin

