GA4 traffic source data does not match BigQuery
I am trying to export traffic source data and event attribution from BigQuery and match it against GA4 (session_source and session_medium). I extract the event parameters (source and medium) from BigQuery, but there is a big gap between the two data sources.
Is there any way to solve it?
I have tried the SQL below:
with prep as (
select
user_pseudo_id,
(select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
max((select value.string_value from unnest(event_params) where key = 'source')) as source,
max((select value.string_value from unnest(event_params) where key = 'medium')) as medium,
max((select value.string_value from unnest(event_params) where key = 'name')) as campaign,
max((select value.string_value from unnest(event_params) where key = 'term')) as term,
max((select value.string_value from unnest(event_params) where key = 'content')) as content,
platform,
FROM `XXX`
group by
user_pseudo_id,
session_id,
platform
)
select
-- session medium (dimension | the value of a medium associated with a session)
platform,
coalesce(source,'(none)') as source_session,
coalesce(medium,'(none)') as medium_session,
coalesce(campaign,'(none)') as campaign_session,
coalesce(content,'(none)') as content,
coalesce(term,'(none)') as term,
count(distinct concat(user_pseudo_id,session_id)) as sessions
from
prep
group by
platform,
source_session,
medium_session,
campaign_session,
content,
term
order by
sessions desc
I'm also trying to figure out why BigQuery can't correctly match the source and medium to the event. The first issue I found is that it assigns the source/medium as google/organic even though there is a gclid parameter in the link. The second issue is that there are huge gaps in recognizing the source as direct: in such cases I do not have these parameters on the events at all.
The values are valid, but only for the source and medium that acquired the user.
When I compare the data in UA and GA4, session attribution is correct, so it looks like a problem with the export to BigQuery. I reported this to support and am waiting for a response.
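As a possible stopgap for the two issues described above, the attribution could be patched at query time. The sketch below is only an illustration, not a confirmed fix: it uses made-up inline rows instead of the real export, assumes the landing URL is available as a `page_location` column, assumes a click carrying `gclid` should be reported as google/cpc, and assumes sessions with no source parameters at all should be labelled (direct)/(none).

```sql
-- Hypothetical sample rows standing in for one row per session.
with sessions as (
  select 'u1' as user_pseudo_id, 101 as session_id,
         'google' as source, 'organic' as medium,
         'https://example.com/?gclid=abc123' as page_location
  union all
  select 'u2', 202, 'google', 'organic', 'https://example.com/'
  union all
  -- No source/medium parameters at all, as described for direct traffic.
  select 'u3', 303, cast(null as string), cast(null as string), 'https://example.com/'
)
select
  user_pseudo_id,
  session_id,
  -- Assumption: a gclid in the landing URL means a Google Ads click.
  case
    when regexp_contains(page_location, r'[?&]gclid=') then 'google'
    else coalesce(source, '(direct)')
  end as source_session,
  case
    when regexp_contains(page_location, r'[?&]gclid=') then 'cpc'
    else coalesce(medium, '(none)')
  end as medium_session
from sessions
```

Whether this brings the numbers in line with the GA4 interface depends on how the export populated the parameters in the first place, so it is worth comparing a few known sessions by hand.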
I have also noticed that source/medium does not align between BigQuery and GA4 and, as Justyna commented, a lot of my source/medium values come through as google/organic even when they are not. I am hoping Justyna will post here when there is a solution.
Looking at your code, I can see two other areas that would cause discrepancies:
1)
count(distinct concat(user_pseudo_id,session_id)) as sessions
This only counts events with a valid user_pseudo_id and session_id, because concat returns null when either ID is null and count(distinct ...) skips nulls. That is the correct way to count, but in my data there tend to be a few events where the IDs are null, so your session count excludes them while GA4 appears to include them. Use your preferred method of counting nulls to work out whether this is an issue for you.
2) You are also doing an exact count, which again is correct, but GA4 does an approximate count; see the link below for details.
https://developers.google.com/analytics/blog/2022/hll#using_bigquery_hll_functions_with_google_analytics_event_data
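The approximate count from point 2 could be sketched with BigQuery's `HLL_COUNT` functions as below. This is a minimal illustration on made-up inline rows; the precision value of 12 is an assumption here, so check the linked post for the value GA4 actually uses for sessions.

```sql
-- Hypothetical sample rows: one row per event, already reduced to the two IDs.
with events as (
  select 'u1' as user_pseudo_id, 101 as session_id
  union all select 'u1', 101
  union all select 'u1', 102
  union all select 'u2', 201
  union all select cast(null as string), 301
)
select
  -- Exact distinct count, as in the original query.
  count(distinct concat(user_pseudo_id, cast(session_id as string))) as exact_sessions,
  -- Approximate distinct count: build a HyperLogLog++ sketch, then read its estimate.
  hll_count.extract(
    hll_count.init(concat(user_pseudo_id, cast(session_id as string)), 12)
  ) as approx_sessions
from events
```

On a handful of rows the two columns should agree; any difference only shows up at realistic volumes, which is where GA4's reported totals can drift from an exact count.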
Using the above two techniques I can get a lot closer to the GA4 number of sessions, but they are still not attributed correctly.
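To check whether the null IDs from point 1 matter for your data, something like the following could be used. It is a sketch on made-up inline rows; in practice the two columns would come from the export table, with `session_id` extracted from the `ga_session_id` event parameter as in the question.

```sql
-- Hypothetical sample rows: two events are missing one of the IDs.
with events as (
  select 'u1' as user_pseudo_id, 101 as session_id
  union all select 'u1', 101
  union all select cast(null as string), 102
  union all select 'u2', cast(null as int64)
)
select
  count(*) as events,
  countif(user_pseudo_id is null) as events_without_user_pseudo_id,
  countif(session_id is null) as events_without_session_id,
  -- concat yields null when either ID is null, so those events are not counted here.
  count(distinct concat(user_pseudo_id, cast(session_id as string))) as sessions_counted
from events
```

If the two "without" columns are close to zero relative to the total, null IDs are unlikely to explain much of the gap.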