Different results UNNEST BigQuery

I don't understand the difference between these two queries:

SELECT event_timestamp ,user_pseudo_id, value.double_value as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
WHERE event_name = "purchase" and event_params.key = "tax" 

The other query is:

SELECT event_timestamp ,user_pseudo_id, 
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` 
WHERE event_name = "purchase"

The first query returns 5,242 rows and the second returns 5,692. What is the mistake?

Thank you!

It depends on what you define as accurate. The row counts differ because the two queries handle the tax field differently. You can list the discrepancies by running the following query:

with unnested as (
    SELECT event_timestamp ,user_pseudo_id, value.double_value as tax
    FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
    WHERE event_name = "purchase" and event_params.key = "tax"
 ) 
SELECT events.event_timestamp ,events.user_pseudo_id, 
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` events
LEFT JOIN unnested un
 on events.event_timestamp=un.event_timestamp
 and events.user_pseudo_id=un.user_pseudo_id
WHERE events.event_name = "purchase"
and un.event_timestamp is null
;

Pick a single record from that list and inspect it with the following two queries:

    SELECT *
    FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
    WHERE 1=1
    -- and event_name = "purchase" and event_params.key = "tax"
    and event_name = "purchase" and event_timestamp=1608955242902332 and user_pseudo_id='43627350.3807676886';


    SELECT 
    *,
    (SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
    FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` events
    WHERE events.event_name = "purchase"
    and event_timestamp=1608955242902332 and user_pseudo_id='43627350.3807676886'
    ;

The first query joins each event to its unnested parameters and then keeps only the rows whose key is "tax", so purchase events with no tax parameter drop out of the final set. The second query keeps every purchase event and returns a null tax value when the parameter is missing. If the count you want depends on a value being present in the tax field, 5,242 is the correct number.
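To make the difference concrete without querying BigQuery, here is a minimal Python sketch that models the two query shapes on a toy data set. The `events` list, the `event_id` field, and both function names are made up for illustration; they are not part of the GA4 schema or of any BigQuery API.

```python
# Toy stand-in for the events table: each event carries a list of
# key/value parameters, similar to the repeated event_params field.
# event_id is a hypothetical field added only to tell the rows apart.
events = [
    {"event_id": 1, "event_name": "purchase",
     "event_params": [{"key": "tax", "double_value": 1.5},
                      {"key": "currency", "double_value": None}]},
    {"event_id": 2, "event_name": "purchase",   # purchase with no "tax" key
     "event_params": [{"key": "currency", "double_value": None}]},
    {"event_id": 3, "event_name": "page_view",
     "event_params": [{"key": "tax", "double_value": 9.9}]},
]


def cross_join_unnest(rows):
    """Models query 1: FROM events, UNNEST(event_params) WHERE key = 'tax'."""
    return [
        (e["event_id"], p["double_value"])
        for e in rows if e["event_name"] == "purchase"
        for p in e["event_params"] if p["key"] == "tax"
    ]


def scalar_subquery(rows):
    """Models query 2: (SELECT ... FROM UNNEST(event_params) WHERE key = 'tax')."""
    out = []
    for e in rows:
        if e["event_name"] != "purchase":
            continue
        matches = [p["double_value"] for p in e["event_params"]
                   if p["key"] == "tax"]
        # No match yields NULL (None) and the event row is still kept.
        # Simplification: this sketch takes the first match, whereas a real
        # scalar subquery fails if it returns more than one row.
        out.append((e["event_id"], matches[0] if matches else None))
    return out


q1 = cross_join_unnest(events)
q2 = scalar_subquery(events)
print(q1)  # [(1, 1.5)]
print(q2)  # [(1, 1.5), (2, None)]
```

In this toy data, dropping the rows where the tax is `None` from `q2` gives the same rows as `q1`, which mirrors the gap between the two counts in the question.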
