In BigQuery, I'd like to replicate the Google Analytics 4 metric 'engagement rate', which is defined as:
sessions with engagement / total sessions
I need to calculate it for all platforms (iOS / Android / Web). In BigQuery, I'm using the default Google Analytics 4 export tables.
In these tables I see several engagement-related parameters, and even the same parameter with different value types, which confuses me a bit:
For replicating engagement rate, option 2 above seems unusable, since its data is only available for web and I also need the calculation for iOS and Android.
With 'option 1' or 'option 3' above, the BigQuery output is nearly identical; the two differ only slightly. Compared with the GA4 UI, however, neither matches: the GA4 UI numbers are consistently 3-4% higher for each platform.
Query following 'option 1':
SELECT platform,
  SAFE_DIVIDE(
    COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'session_engaged') = 1
      THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
  ) AS engagement_rate
FROM `[project id].[dataset id].events_*`
WHERE _table_suffix BETWEEN '20221008' AND '20221008'
GROUP BY 1
Query following 'option 3':
SELECT platform,
  SAFE_DIVIDE(
    COUNT(DISTINCT CASE WHEN (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'engaged_session_event') = 1
      THEN CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')) END),
    COUNT(DISTINCT CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id')))
  ) AS engagement_rate
FROM `[project id].[dataset id].events_*`
WHERE _table_suffix BETWEEN '20221008' AND '20221008'
GROUP BY 1
Clear documentation from Google on the parameters session_engaged and engaged_session_event seems to be missing.
I'm looking for more clarity around the following questions:
Does anyone know more about this?
Thanks in advance!
Option 1 it is!
safe_divide(count(distinct case when (select value.string_value from unnest(event_params) where key = 'session_engaged') = '1' then concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) end),count(distinct concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')))) as engagement_rate
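For context, here is a rough sketch of how that expression could sit inside a complete query. It is not an official recipe. It assumes that session_engaged may arrive either as string_value '1' or as int_value 1, depending on the event, so both are normalised to a string before the comparison. The table placeholder and date range are copied from the question; the CTE name events and the column names session_key and session_engaged are made up for illustration.

```sql
-- Sketch only: engagement rate per platform from the GA4 export tables.
-- Assumption: session_engaged can be stored as string_value '1' or int_value 1,
-- so both forms are folded into one string column before comparing.
WITH events AS (
  SELECT
    platform,
    -- One key per session: user_pseudo_id plus ga_session_id.
    -- The key is NULL when ga_session_id is missing, so such rows are not counted.
    CONCAT(
      user_pseudo_id,
      CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)
    ) AS session_key,
    (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
     FROM UNNEST(event_params)
     WHERE key = 'session_engaged') AS session_engaged
  FROM `[project id].[dataset id].events_*`
  WHERE _table_suffix BETWEEN '20221008' AND '20221008'
)
SELECT
  platform,
  SAFE_DIVIDE(
    COUNT(DISTINCT IF(session_engaged = '1', session_key, NULL)),  -- sessions with engagement
    COUNT(DISTINCT session_key)                                    -- total sessions
  ) AS engagement_rate
FROM events
GROUP BY platform
```

Building the session key once in the CTE keeps the numerator and denominator on the same definition of a session. Whether this closes the gap with the GA4 UI would still need to be checked against your own data.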