Calculate bounce rate with ClickHouse

I am trying to use ClickHouse for a small analytics app of mine, and I have a table that records raw hits:

CREATE TABLE hits (
  sessionId LowCardinality(String),
  page LowCardinality(String),
  timestamp DateTime,
  projectId UInt16
) ENGINE = MergeTree() PARTITION BY toYYYYMM(timestamp)
ORDER BY (projectId, page, toStartOfHour(timestamp))
  SETTINGS index_granularity = 8192;

Afterwards I can add some sample data:

sessionId page    timestamp            projectId 
xxx       /       2021-03-12 13:51:12  1         
yyy       /       2021-03-12 13:51:12  1         
xxx       /cool   2021-03-12 13:52:12  1         
fff       /       2021-03-12 13:53:12  1                 

What I am trying to achieve is calculating bounces (sessionIds that occur only once) and views per page, something like:

page   bounces views projectId
/      2       3     1
/cool  0       1     1

I can easily count the views per page, but counting the unique sessionIds fails because of the GROUP BY clause:

SELECT page,
  projectId,
  count(*) as views,
  count(DISTINCT sessionId) as bounces --fail
from hits
GROUP BY (page, projectId);

Any ideas or workarounds, whether changing the ClickHouse schema or even using one of ClickHouse's engines for aggregation, would be highly appreciated.

Check https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/parametric-functions/#function-sequencecount

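SELECT projectId, p[1] AS page, countIf(length(p) = 1) AS bounce
FROM (
  -- collect every page each session visited
  SELECT
    projectId, sessionId,
    groupArray(page) AS p
  FROM hits
  GROUP BY sessionId, projectId
)
-- a session whose array holds a single page counts as a bounce
GROUP BY projectId, page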
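
The query above reports bounces only, keyed by the first page in each session's array. To get views and bounces side by side, as in the table the question asks for, one option is to count hits per session in a subquery and join that back to the raw hits. The following is a minimal sketch of that idea, not a tested ClickHouse deployment: it runs the query against the question's sample rows in SQLite through Python's built-in sqlite3 module, on the assumption that the same portable SQL (JOIN, COUNT, SUM over a CASE) behaves the same way in ClickHouse. The names session_hits, SAMPLE_HITS, QUERY and bounces_and_views are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (sessionId, page, timestamp, projectId).
SAMPLE_HITS = [
    ("xxx", "/", "2021-03-12 13:51:12", 1),
    ("yyy", "/", "2021-03-12 13:51:12", 1),
    ("xxx", "/cool", "2021-03-12 13:52:12", 1),
    ("fff", "/", "2021-03-12 13:53:12", 1),
]

# The subquery counts hits per (projectId, sessionId); a session with exactly
# one hit is treated as a bounce and credited to the page it viewed.
QUERY = """
SELECT h.page, h.projectId,
       COUNT(*) AS views,
       SUM(CASE WHEN s.session_hits = 1 THEN 1 ELSE 0 END) AS bounces
FROM hits AS h
JOIN (
    SELECT projectId, sessionId, COUNT(*) AS session_hits
    FROM hits
    GROUP BY projectId, sessionId
) AS s ON h.projectId = s.projectId AND h.sessionId = s.sessionId
GROUP BY h.page, h.projectId
ORDER BY h.page
"""


def bounces_and_views(rows):
    """Load rows into an in-memory SQLite table and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE hits "
            "(sessionId TEXT, page TEXT, timestamp TEXT, projectId INTEGER)"
        )
        conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Each result row is (page, projectId, views, bounces).
    for page, project_id, views, bounces in bounces_and_views(SAMPLE_HITS):
        print(page, bounces, views, project_id)
```

On the sample data this gives 3 views and 2 bounces for / and 1 view and 0 bounces for /cool, which matches the expected table in the question. On a large hits table the self-join reads the data twice, so whether this is fast enough in ClickHouse would need to be measured.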
