
Funnel query with Elasticsearch

I'm trying to analyze a funnel using event data in Elasticsearch and am having difficulty finding an efficient query to extract that data.

For example, in Elasticsearch I have:

timestamp          action        user id
---------          ------        -------
2015-05-05 12:00   homepage      1
2015-05-05 12:01   product page  1
2015-05-05 12:02   homepage      2
2015-05-05 12:03   checkout      1

I would like to extract the funnel statistics. For example:

homepage_count  product_page_count  checkout_count
--------------  ------------------  --------------
2               1                   1

Here homepage_count is the number of distinct users who visited the homepage, product_page_count is the number of distinct users who visited the product page after visiting the homepage, and checkout_count is the number of distinct users who checked out after visiting both the homepage and the product page.
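To make the intended semantics concrete, here is a minimal Python sketch (not an Elasticsearch query) that computes the ordered funnel over the sample rows above. The names `events`, `FUNNEL_STEPS` and `funnel_counts` are hypothetical, and it assumes each user's events can be walked in timestamp order.

```python
from collections import defaultdict

# Sample events from the question: (timestamp, action, user id).
events = [
    ("2015-05-05 12:00", "homepage", 1),
    ("2015-05-05 12:01", "product page", 1),
    ("2015-05-05 12:02", "homepage", 2),
    ("2015-05-05 12:03", "checkout", 1),
]

# Hypothetical funnel definition: the actions in the order they must occur.
FUNNEL_STEPS = ["homepage", "product page", "checkout"]


def funnel_counts(events, steps):
    """Count distinct users who reached each step after all earlier steps."""
    # progress[user] = number of funnel steps the user has completed in order.
    progress = defaultdict(int)
    # Timestamps in this format sort chronologically as plain strings.
    for _, action, user in sorted(events):
        reached = progress[user]
        if reached < len(steps) and action == steps[reached]:
            progress[user] = reached + 1
    # A user counts toward step i if they completed at least i + 1 steps.
    return {
        step: sum(1 for done in progress.values() if done > i)
        for i, step in enumerate(steps)
    }


print(funnel_counts(events, FUNNEL_STEPS))
# {'homepage': 2, 'product page': 1, 'checkout': 1}
```

A user who checks out without first visiting the homepage and the product page is not counted in the later steps, which is the property that distinguishes a funnel from plain per-action counts.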

What would be the best query to achieve that with Elasticsearch?

This can be achieved by combining a terms aggregation on the action field with a cardinality sub-aggregation that counts the unique users per action, as shown below. Note that I've also added a range query in case you want to restrict the observation period:

{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "gte": "2021-06-01",
        "lte": "2021-06-07"
      }
    }
  },
  "aggs": {
    "actions": {
      "terms": {
        "field": "action"
      },
      "aggs": {
        "users": {
          "cardinality": {
            "field": "user_id"
          }
        }
      }
    }
  }
}
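The response to this request nests one bucket per action under `aggregations.actions.buckets`, with the sub-aggregation result in `users.value`. A minimal Python sketch for flattening that into one count per action follows. The `response` dictionary is a hand-written illustration of the response shape, not real cluster output, and `users_per_action` is a hypothetical helper name.

```python
def users_per_action(response):
    """Map each action bucket key to its distinct-user count."""
    buckets = response["aggregations"]["actions"]["buckets"]
    return {bucket["key"]: bucket["users"]["value"] for bucket in buckets}


# Hand-written example of the response shape (illustrative values only).
response = {
    "aggregations": {
        "actions": {
            "buckets": [
                {"key": "homepage", "doc_count": 2, "users": {"value": 2}},
                {"key": "checkout", "doc_count": 1, "users": {"value": 1}},
                {"key": "product page", "doc_count": 1, "users": {"value": 1}},
            ]
        }
    }
}

print(users_per_action(response))
# {'homepage': 2, 'checkout': 1, 'product page': 1}
```

These are independent per-action counts: the query does not check that a user passed through the earlier steps first, and the cardinality aggregation returns an approximate distinct count.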

UPDATE

This is a typical case where the scripted_metric aggregation comes in handy. The implementation is a bit naive, but it shows you the basics of implementing a funnel.

POST test/_search
{
  "size": 0,
  "aggs": {
    "funnel": {
      "scripted_metric": {
        "init_script": """
          state.users = new HashMap()
        """,
        "map_script": """
          def user = doc['user'].value.toString();
          def action = doc['action.keyword'].value;
          if (!state.users.containsKey(user)) {
            state.users[user] = [
              'homepage': false,
              'product': false,
              'checkout': false
            ];
          }
          state.users[user][action] = true;
        """,
        "combine_script": """
          return state.users;
        """,
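        "reduce_script": """
          // Sum the per-shard flags into one counter per funnel step.
          // Caveat: a user whose events sit on several shards appears in
          // several states and may be counted more than once for a step.
          def global = [
            'homepage': 0,
            'product': 0,
            'checkout': 0
          ];
          for (state in states) {
            for (user in state.keySet()) {
              if (state[user].homepage) global.homepage++;
              if (state[user].product) global.product++;
              if (state[user].checkout) global.checkout++;
            }
          }
          return global;
        """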
      }
    }
  }
}

The above aggregation returns exactly the numbers you expect, i.e.:

  "aggregations" : {
    "funnel" : {
      "value" : {
        "product" : 1,
        "checkout" : 1,
        "homepage" : 2
      }
    }
  }
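To see what each script phase contributes, it can help to trace them outside Elasticsearch. The Python sketch below mimics the four phases under the assumption that the sample documents are split across two shards. `shards`, `map_phase` and `reduce_phase` are hypothetical names, and the documents use `product` as the action value so that they match the keys in the script.

```python
STEPS = ("homepage", "product", "checkout")

# Assumed split of the sample documents across two shards.
shards = [
    [{"user": 1, "action": "homepage"}, {"user": 1, "action": "product"}],
    [{"user": 2, "action": "homepage"}, {"user": 1, "action": "checkout"}],
]


def map_phase(docs):
    """Mimic init_script, map_script and combine_script for one shard."""
    users = {}  # init_script: state.users = new HashMap()
    for doc in docs:  # map_script runs once per matching document
        flags = users.setdefault(str(doc["user"]), dict.fromkeys(STEPS, False))
        flags[doc["action"]] = True
    return users  # combine_script: return state.users


def reduce_phase(states):
    """Mimic reduce_script: add up the per-shard flags into global counts."""
    totals = dict.fromkeys(STEPS, 0)
    for state in states:
        for flags in state.values():
            for step in STEPS:
                if flags[step]:
                    totals[step] += 1
    return totals


print(reduce_phase([map_phase(docs) for docs in shards]))
# {'homepage': 2, 'product': 1, 'checkout': 1}
```

The trace reproduces the counts above, and it also shows the two limits of the approach: the flags only record that a user performed an action at some point, so the order of the steps is not verified, and a user whose events for the same action land on different shards is counted once per shard.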


 