
dbt macro using Jinja in BigQuery

I am trying to compare some columns in an automated way. The columns I am comparing can be float, integer, string, or date/timestamp, but I am having trouble with the syntax and with how best to cover the last case, i.e. dates/timestamps.

This is what I have so far, but I am getting type errors. I am also not sure whether my approach would work at all, since I am stuck at the very first stage.

{% macro find_mismatch_s1_s2_(s1_col, s2_col,field) -%}
{%if type(s1_col) is string %}
  if(coalesce({{s1_col}},"") != coalesce({{s2_col}},""),true,false ) as is_{{field}}_mismatch
{%- elif type(s1_col) is float or type(s1_col) is integer -%}
  if(coalesce({{s1_col}},0) != coalesce({{s2_col}},0),true,false) as is_{{field}}_mismatch
{% else %}
  if(coalesce({{s1_col}},{{another_macro_that_gets_a_date}}) != coalesce({{s2_col}},{{another_macro_that_gets_a_date}}),true,false) as is_{{field}}_mismatch
{% endif %}
{% endmacro %}
  • First, I check whether the column type is string. If so, I compare the values of the two columns, and if a value is null I use an empty string for the comparison.
  • Second, I check whether the column type is float or integer. If so, I compare the two columns, and if a value is null I compare against 0.
  • Lastly, I cover the case of dates/timestamps. I couldn't find a type() check for date/timestamp, so I put it in the else branch, which might not be the best approach. It would be great if someone could recommend how best to cover these cases.
  • The reason I add the coalesce statements is that if the value in one of the columns is null, is_{{field}}_mismatch would return false, even though it should be true.
  • If I use empty strings in the coalesce for the second and third cases, the query fails because of the column type, so I have to handle these in separate if statements.

The data in your database does not flow through the code you write in Jinja. Jinja templates a SQL query before that query is executed.

When you write type(s1_col), you might expect to get the data type of the database column named s1_col, but you will always get a string, since the variable s1_col is a string that holds the name of a database column.
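To make this concrete, here is a minimal sketch using a hypothetical macro named show_arg_type (not part of dbt or of the original question). It uses Jinja's built-in `is string` test rather than type(), since, as far as I know, Jinja does not expose Python's type() function at all.

```sql
{% macro show_arg_type(s1_col) -%}
    {#- s1_col holds a column *name* such as "amount", never the column's data,
        so this branch is taken no matter what type the column has -#}
    {%- if s1_col is string -%}
        'jinja sees a string'
    {%- else -%}
        'jinja sees something else'
    {%- endif -%}
{%- endmacro %}
```

Calling it as `select {{ show_arg_type('amount') }} as arg_type` should compile to roughly `select 'jinja sees a string' as arg_type`, whether amount is a FLOAT64, a STRING, or a TIMESTAMP column. The branch is chosen while the query text is being built, before BigQuery sees anything.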

You can write this logic in a type-agnostic way by returning true when exactly one of the two columns is null (in SQL, not Jinja):

{% macro find_mismatch_s1_s2_(s1_col, s2_col,field) -%}
case
    when {{ s1_col }} is null and {{ s2_col }} is not null
    then true
    when {{ s2_col }} is null and {{ s1_col }} is not null
    then true
    when {{ s1_col }} is null and {{ s2_col }} is null
    then false
    else {{ s1_col }} != {{ s2_col }}
end as is_{{field}}_mismatch
{% endmacro %}
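A shorter variant of the same idea, as a sketch: BigQuery supports the IS DISTINCT FROM operator, which treats two nulls as equal and a null versus a non-null value as different, so it should give the same result as the case expression above. The macro name find_mismatch_distinct is hypothetical.

```sql
{% macro find_mismatch_distinct(s1_col, s2_col, field) -%}
    {#- null-safe comparison: true when exactly one side is null or the values differ -#}
    ({{ s1_col }} is distinct from {{ s2_col }}) as is_{{ field }}_mismatch
{%- endmacro %}
```

Either macro is called from a model with column names passed as strings. The model and column names below are made up for illustration:

```sql
select
    s1.id,
    {{ find_mismatch_distinct('s1.amount', 's2.amount', 'amount') }},
    {{ find_mismatch_distinct('s1.created_at', 's2.created_at', 'created_at') }}
from {{ ref('source_one') }} as s1
join {{ ref('source_two') }} as s2
    on s1.id = s2.id
```

Because the comparison happens in SQL, the same macro works for string, numeric, and date/timestamp columns, as long as the two columns being compared have compatible types.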

Another option, if you really need the data type, is to use dbt's adapter class, specifically the get_columns_in_relation method. Each Column in the returned list has a data_type property. See this answer for more info.
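A rough sketch of that approach, assuming the relation already exists in the warehouse when the model is compiled. The macro name null_default_for and the '1900-01-01' sentinel are my own placeholders, not something from the question. It relies on the is_string() and is_number() helpers on dbt's Column class; how the BigQuery adapter classifies each type (for example BIGNUMERIC) is worth checking before relying on it.

```sql
{% macro null_default_for(relation, column_name) -%}
    {#- Ask the warehouse for column metadata; each item is a dbt Column object -#}
    {%- set columns = adapter.get_columns_in_relation(relation) -%}
    {%- for col in columns if col.name | lower == column_name | lower -%}
        {%- if col.is_string() -%}
            ''
        {%- elif col.is_number() -%}
            0
        {%- else -%}
            {#- assumed to be DATE/DATETIME/TIMESTAMP; other types (BOOL, ARRAY, ...)
                would need their own branch -#}
            cast('1900-01-01' as {{ col.data_type }})
        {%- endif -%}
    {%- endfor -%}
{%- endmacro %}
```

It could then be used inside the original coalesce, for example `coalesce(amount, {{ null_default_for(ref('source_one'), 'amount') }})`, where source_one is again a made-up model name. Note that coalescing to a sentinel treats a real value equal to the sentinel the same as null, which the type-agnostic version above avoids.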
