Funnel query with Elasticsearch
I'm trying to analyze a funnel using event data in Elasticsearch and am having difficulty finding an efficient query to extract that data.
For example, in Elasticsearch I have:
timestamp         action        user id
----------------  ------------  -------
2015-05-05 12:00  homepage      1
2015-05-05 12:01  product page  1
2015-05-05 12:02  homepage      2
2015-05-05 12:03  checkout      1
I would like to extract the funnel statistics. For example:
homepage_count  product_page_count  checkout_count
--------------  ------------------  --------------
2               1                   1
Where homepage_count represents the distinct number of users who visited the homepage, product_page_count represents the distinct number of users who visited the product page after visiting the homepage, and checkout_count represents the number of users who checked out after visiting the homepage and the product page.
What would be the best query to achieve that with Elasticsearch?
This can be achieved with a combination of a terms aggregation for the actions and a cardinality sub-aggregation for the unique user count per action, like below. Note that I've also added a range query in case you want to restrict the period to observe:
{
  "size": 0,
  "query": {
    "range": {
      "timestamp": {
        "gte": "2021-06-01",
        "lte": "2021-06-07"
      }
    }
  },
  "aggs": {
    "actions": {
      "terms": {
        "field": "action"
      },
      "aggs": {
        "users": {
          "cardinality": {
            "field": "user_id"
          }
        }
      }
    }
  }
}
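In the response to this query, each bucket of the actions aggregation carries the action in key and the distinct user count in users.value. Here is a minimal Python sketch of building the request body and flattening such a response. It assumes the field names from the query above. The helper names build_query and users_per_action and the sample response are made up for illustration, and no Elasticsearch client is called:

```python
def build_query(start, end):
    """Build the terms + cardinality request body for a time window."""
    return {
        "size": 0,
        "query": {"range": {"timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "actions": {
                "terms": {"field": "action"},
                "aggs": {"users": {"cardinality": {"field": "user_id"}}},
            }
        },
    }


def users_per_action(response):
    """Map each action to the distinct user count reported for its bucket."""
    buckets = response["aggregations"]["actions"]["buckets"]
    return {b["key"]: b["users"]["value"] for b in buckets}


# Hypothetical response, shaped like the sample data in the question.
sample_response = {
    "aggregations": {
        "actions": {
            "buckets": [
                {"key": "homepage", "doc_count": 2, "users": {"value": 2}},
                {"key": "product page", "doc_count": 1, "users": {"value": 1}},
                {"key": "checkout", "doc_count": 1, "users": {"value": 1}},
            ]
        }
    }
}

print(users_per_action(sample_response))
```

Keep in mind that the cardinality aggregation returns an approximate count, and that each action is counted independently here: nothing checks that a user reached the product page after the homepage.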
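UPDATE
This is a typical case where the scripted_metric aggregation comes in handy. The implementation is a bit naive: it only records, per user, whether each action occurred at all, without checking the order of the actions, and it expects the action values to be exactly homepage, product and checkout. Still, it shows you the basics of implementing a funnel.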
POST test/_search
{
  "size": 0,
  "aggs": {
    "funnel": {
      "scripted_metric": {
        "init_script": """
          // per-shard map: user -> flag for each funnel step
          state.users = new HashMap();
        """,
        "map_script": """
          def user = doc['user'].value.toString();
          def action = doc['action.keyword'].value;
          if (!state.users.containsKey(user)) {
            state.users[user] = [
              'homepage': false,
              'product': false,
              'checkout': false
            ];
          }
          // mark the step this event belongs to as reached
          state.users[user][action] = true;
        """,
        "combine_script": """
          return state.users;
        """,
        "reduce_script": """
          def global = [
            'homepage': 0,
            'product': 0,
            'checkout': 0
          ];
          // note: a user whose events live on several shards
          // is counted once per shard
          for (state in states) {
            for (user in state.keySet()) {
              if (state[user].homepage) global.homepage++;
              if (state[user].product) global.product++;
              if (state[user].checkout) global.checkout++;
            }
          }
          return global;
        """
      }
    }
  }
}
The above aggregation will return exactly the numbers you expect, i.e.:
"aggregations" : {
  "funnel" : {
    "value" : {
      "product" : 1,
      "checkout" : 1,
      "homepage" : 2
    }
  }
}
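Since the question asks for users who reached each step after the previous one, it can help to see the ordered logic spelled out. The following is a minimal Python sketch that runs client-side on events that have already been fetched, not inside Elasticsearch. The function name funnel_counts, the STEPS list and the tuple layout are assumptions for illustration, and the events are the sample rows from the question:

```python
from collections import defaultdict

# Funnel steps in the order a user has to complete them (assumed names).
STEPS = ["homepage", "product page", "checkout"]


def funnel_counts(events, steps=STEPS):
    """Count the users who reached each step in order.

    events: iterable of (timestamp, action, user_id) tuples, where the
    timestamp sorts chronologically (e.g. "2015-05-05 12:00").
    """
    progress = defaultdict(int)  # user_id -> number of steps completed so far
    for _, action, user in sorted(events):
        done = progress[user]
        # Advance only when the event is exactly the user's next expected step.
        if done < len(steps) and action == steps[done]:
            progress[user] = done + 1

    counts = [0] * len(steps)
    for done in progress.values():
        for i in range(done):
            counts[i] += 1
    return dict(zip(steps, counts))


events = [
    ("2015-05-05 12:00", "homepage", 1),
    ("2015-05-05 12:01", "product page", 1),
    ("2015-05-05 12:02", "homepage", 2),
    ("2015-05-05 12:03", "checkout", 1),
]

print(funnel_counts(events))
# {'homepage': 2, 'product page': 1, 'checkout': 1}
```

The same per-user "next expected step" idea could in principle be moved into the map and reduce scripts, but that would also require the events of one user to be processed in timestamp order, which the scripted metric does not guarantee on its own.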