GA4 data in BigQuery: How to replicate the engagement rate?

In BigQuery, I'd like to replicate the Google Analytics 4 metric 'engagement rate', which is defined as (more info):

sessions with engagement / total sessions

I need to calculate it for all platforms (iOS / Android / Web). In BigQuery, I'm using the default Google Analytics 4 data import tables.

There I see various engagement-related parameters, and even the same parameter with different value types, which confuses me a bit:

  1. Parameter 'session_engaged': on session_start events, this parameter is included as type integer. It is only added when session_engaged = 1. Data is available for all platforms (iOS, Android, web).
  2. Parameter 'session_engaged': on all events except session_start, this parameter is included as type string. It is present on 100% of existing (web) events, with the value '0' or '1'. Data is ONLY available for platform = 'web'.
  3. Parameter 'engaged_session_event': this parameter is included only as type integer, and only on events where its value = 1. Data is available for all platforms (iOS, Android, web).
  4. There is also the parameter engagement_time_msec. I didn't use it in the scope of this post, since I still doubt its validity (see also an earlier post in which I questioned the parameter values).
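The observations in the list above can be checked with a small inventory query. This is a minimal sketch, not a definitive diagnostic: it assumes the standard GA4 export schema (`event_params` as a repeated record with `value.int_value` and `value.string_value`) and reuses the question's `[project id].[dataset id]` placeholder and date, which need to be replaced before it will run.

```sql
-- Count events per platform, parameter key, populated value field and value.
-- Table name and date range are placeholders taken from the question.
SELECT
  platform,
  event_name = 'session_start' AS is_session_start,
  ep.key AS param_key,
  CASE
    WHEN ep.value.int_value IS NOT NULL THEN 'int'
    WHEN ep.value.string_value IS NOT NULL THEN 'string'
    ELSE 'other'
  END AS value_type,
  -- Normalise both representations to a string so they can share a column.
  COALESCE(CAST(ep.value.int_value AS STRING), ep.value.string_value) AS param_value,
  COUNT(*) AS event_count
FROM `[project id].[dataset id].events_*`,
  UNNEST(event_params) AS ep
WHERE _TABLE_SUFFIX BETWEEN '20221008' AND '20221008'
  AND ep.key IN ('session_engaged', 'engaged_session_event')
GROUP BY 1, 2, 3, 4, 5
ORDER BY 1, 2, 3, 4, 5
```

Each output row shows how often a given parameter appears with a given type and value, split by platform and by whether the event is a session_start event.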

To replicate the engagement rate, option 2 above can't be used, since its data is only available for web and I also need the calculation for iOS and Android.

Following option 1 or option 3 above: the output in BigQuery is more or less equal between the two, with only a very small difference. Comparing the output of both options to the GA4 UI, the numbers don't match; in the GA4 UI, they are consistently 3-4% higher for each platform.

Query following 'option 1':

SELECT platform,
SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = 1 THEN CONCAT(user_pseudo_id,(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),COUNT(DISTINCT CONCAT(user_pseudo_id,(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))) AS engagement_rate
FROM `[project id].[dataset id].events_*`
WHERE _table_suffix between '20221008' AND '20221008'
GROUP BY 1

Query following 'option 3':

SELECT platform,
SAFE_DIVIDE(COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engaged_session_event') = 1 THEN CONCAT(user_pseudo_id,(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),COUNT(DISTINCT CONCAT(user_pseudo_id,(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))) AS engagement_rate
FROM `[project id].[dataset id].events_*`
WHERE _table_suffix between '20221008' AND '20221008'
GROUP BY 1
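To see where the two queries above disagree, one option is to compute both flags per session and cross-tabulate them. The following is a sketch under the same assumptions as the question's queries (standard GA4 export schema, placeholder table name and date); the column names `engaged_opt1` and `engaged_opt3` are introduced here for illustration only.

```sql
WITH flags AS (
  SELECT
    platform,
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
    -- Option 1: integer session_engaged = 1 on this event.
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = 1 AS opt1,
    -- Option 3: integer engaged_session_event = 1 on this event.
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engaged_session_event') = 1 AS opt3
  FROM `[project id].[dataset id].events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20221008' AND '20221008'
),
per_session AS (
  SELECT
    platform,
    user_pseudo_id,
    ga_session_id,
    -- A session counts as engaged if any of its events carries the flag.
    COALESCE(LOGICAL_OR(opt1), FALSE) AS engaged_opt1,
    COALESCE(LOGICAL_OR(opt3), FALSE) AS engaged_opt3
  FROM flags
  WHERE ga_session_id IS NOT NULL
  GROUP BY 1, 2, 3
)
SELECT
  platform,
  engaged_opt1,
  engaged_opt3,
  COUNT(*) AS session_count
FROM per_session
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
```

Rows where `engaged_opt1` and `engaged_opt3` differ correspond to the sessions behind the small gap between the two options.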

Clear documentation from Google seems to be missing for the parameters session_engaged and engaged_session_event.

I'm looking for more clarity on the following questions:

  1. What does each parameter really mean, what is the context around the values of each parameter, and what are the differences between them?
  2. Which parameter should be used in which case?
  3. How can I calculate the 'engagement rate' in BigQuery and replicate the numbers displayed in the GA4 UI?

Does anyone know more about this?

Thanks in advance!

Option 1 it is!

safe_divide(count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as engagement_rate
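The expression above reads session_engaged from `string_value`, while the question reports the parameter as an integer on session_start events and as a string elsewhere. A possible way to cover both cases is to read whichever field is populated and aggregate per session first. This is a sketch only: it assumes the standard GA4 export schema, keeps the question's placeholder table name and date, and is not guaranteed to match the GA4 UI figures.

```sql
WITH events AS (
  SELECT
    platform,
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
    -- session_engaged may be stored as a string or as an integer,
    -- so take whichever field is populated and normalise to a string.
    (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
     FROM UNNEST(event_params)
     WHERE key = 'session_engaged') AS session_engaged
  FROM `[project id].[dataset id].events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20221008' AND '20221008'
),
per_session AS (
  SELECT
    platform,
    user_pseudo_id,
    ga_session_id,
    -- Engaged if any event in the session carries session_engaged = '1'.
    COALESCE(LOGICAL_OR(session_engaged = '1'), FALSE) AS is_engaged
  FROM events
  WHERE ga_session_id IS NOT NULL
  GROUP BY 1, 2, 3
)
SELECT
  platform,
  SAFE_DIVIDE(COUNTIF(is_engaged), COUNT(*)) AS engagement_rate
FROM per_session
GROUP BY platform
```

Grouping on `user_pseudo_id` and `ga_session_id` directly, rather than concatenating them, identifies the same sessions while keeping the two identifiers separate.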
