While doing an incremental load with dbt, I want to aggregate if the row exists, else insert
I am using dbt to incrementally load data from one schema in Redshift to another to create reports. dbt offers a straightforward way to load data incrementally with an upsert. But instead of a traditional upsert, I want to sum the incoming rows with the existing rows in the destination table (matching on the unique ID and summing the rest of the columns) if they already exist, and insert them otherwise. For example, say I have a table:
T1(userid, total_deposit, total_withdrawal)
I have created a table that calculates the total deposit and total withdrawal for each user. When I run an incremental query, I might get a new deposit or withdrawal for an existing user; in that case, I have to add the value to the existing row instead of replacing it, as an upsert would. If the user is new, I just need to do a simple insert. Any suggestions on how to approach this?
dbt is quite opinionated that invocations of dbt should be idempotent. This means that you can run the same command over and over again, and the result will be the same.
The operation you're describing is not idempotent, so you're going to have a hard time getting it to work with dbt out of the box.
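To make the idempotency problem concrete, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for Redshift. The additive_merge helper and the sample batch are hypothetical and only illustrate the "add to the existing row, else insert" behaviour from the question; this is not how dbt implements anything.

```python
import sqlite3


def additive_merge(conn, batch):
    """Add incoming amounts to existing rows; insert rows for new users."""
    for userid, deposit, withdrawal in batch:
        cur = conn.execute(
            "UPDATE t1 SET total_deposit = total_deposit + ?, "
            "total_withdrawal = total_withdrawal + ? WHERE userid = ?",
            (deposit, withdrawal, userid),
        )
        if cur.rowcount == 0:
            # No existing row for this user, so this is a plain insert.
            conn.execute(
                "INSERT INTO t1 (userid, total_deposit, total_withdrawal) "
                "VALUES (?, ?, ?)",
                (userid, deposit, withdrawal),
            )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (userid INTEGER PRIMARY KEY, "
    "total_deposit INTEGER, total_withdrawal INTEGER)"
)

# Hypothetical incremental batch: (userid, deposit, withdrawal).
batch = [(1, 100, 0), (2, 50, 20)]

additive_merge(conn, batch)
after_first_run = conn.execute("SELECT * FROM t1 ORDER BY userid").fetchall()

# Replaying the same batch (for example, a retried run) changes the result.
additive_merge(conn, batch)
after_second_run = conn.execute("SELECT * FROM t1 ORDER BY userid").fetchall()

print(after_first_run)
print(after_second_run)
```

Running the same batch twice doubles every total, which is exactly the kind of result a re-run of a dbt model should not produce.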
As an alternative, I would break this into two steps:
user_id as the unique_key (since you have all of the raw transactions in #1), but I'd start without that and make sure that's absolutely necessary for performance reasons, since it will add a fair bit of complexity.
For more info on complex incremental materializations, I suggest this discourse post written by Tristan Handy, Founder & CEO at dbt Labs.
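One possible reading of the two-step idea can be sketched as follows, again with SQLite standing in for Redshift. It assumes that the first step keeps every raw transaction in an append-only table and that the second step recomputes the per-user totals from that table. The table names (transactions, user_totals), the txn_id column, and both helper functions are hypothetical; in dbt these would be two models rather than Python functions.

```python
import sqlite3


def append_new_transactions(conn, batch):
    """Step 1: append raw transactions, skipping txn_ids already loaded."""
    conn.executemany(
        "INSERT OR IGNORE INTO transactions "
        "(txn_id, userid, deposit, withdrawal) VALUES (?, ?, ?, ?)",
        batch,
    )


def rebuild_user_totals(conn):
    """Step 2: recompute the totals from scratch from the raw transactions."""
    conn.execute("DROP TABLE IF EXISTS user_totals")
    conn.execute(
        "CREATE TABLE user_totals AS "
        "SELECT userid, SUM(deposit) AS total_deposit, "
        "SUM(withdrawal) AS total_withdrawal "
        "FROM transactions GROUP BY userid"
    )


def read_totals(conn):
    return conn.execute(
        "SELECT userid, total_deposit, total_withdrawal "
        "FROM user_totals ORDER BY userid"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY, "
    "userid INTEGER, deposit INTEGER, withdrawal INTEGER)"
)

# Hypothetical batches: (txn_id, userid, deposit, withdrawal).
first_batch = [(1, 1, 100, 0), (2, 2, 50, 20)]
second_batch = [(3, 1, 25, 10), (4, 3, 70, 0)]

append_new_transactions(conn, first_batch)
rebuild_user_totals(conn)

append_new_transactions(conn, second_batch)
rebuild_user_totals(conn)
totals = read_totals(conn)

# Replaying the second batch leaves the totals unchanged.
append_new_transactions(conn, second_batch)
rebuild_user_totals(conn)
totals_after_replay = read_totals(conn)

print(totals)
```

Because the totals are always derived from the full set of raw transactions, re-running either step gives the same result. The cost is that the aggregate is recomputed in full each time, which is the performance trade-off mentioned above.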