Snowflake stored procedure best practices

Original post: 2022-04-15 11:25:53 | Tags: database, stored-procedures, snowflake-cloud-data-platform

Please help me understand the best practices for using a stored procedure that kicks off several root tasks so that they run simultaneously, where each root task has its own tree of child tasks.

Is this a recommended way of doing a data load? What are the efficiency impacts of executing a stored procedure this way?

Apart from the above, please share any other best practices I can follow for loading fact and dimension tables through stored procedures. TIA

1 answer

Typically a TASK would call a SPROC, not the other way around.
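As a rough sketch of that direction (task calls procedure), something like the following could work. The warehouse `load_wh`, the tables `stg_customer` and `dim_customer`, and the procedure and task names are all made up for illustration, and the MERGE is just a placeholder for whatever your load logic is.

```sql
-- Hypothetical procedure that holds the load logic for one dimension table.
-- Table and column names are assumptions, not from the question.
CREATE OR REPLACE PROCEDURE load_dim_customer()
RETURNS STRING
LANGUAGE SQL
AS
$$
BEGIN
  MERGE INTO dim_customer d
  USING stg_customer s
    ON d.customer_id = s.customer_id
  WHEN MATCHED THEN
    UPDATE SET d.name = s.name
  WHEN NOT MATCHED THEN
    INSERT (customer_id, name) VALUES (s.customer_id, s.name);
  RETURN 'dim_customer loaded';
END;
$$;

-- The task owns the schedule and simply calls the procedure.
CREATE OR REPLACE TASK load_dim_customer_task
  WAREHOUSE = load_wh          -- hypothetical warehouse
  SCHEDULE = '60 MINUTE'
AS
  CALL load_dim_customer();

-- Tasks are created suspended, so resume it to start the schedule.
ALTER TASK load_dim_customer_task RESUME;
```

Keeping the logic in the procedure and the orchestration in the task means you can test the procedure on its own with a plain CALL.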

You can use TASKS to build out trees, which allows child tasks to execute in parallel.
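A minimal sketch of such a tree, assuming the procedures `stage_raw_files`, `load_dim_customer` and `load_dim_product` already exist (all names here, including the warehouse `load_wh`, are hypothetical). The two child tasks share the same predecessor, so they can run in parallel once the root finishes.

```sql
-- Root task: the only task in the tree that carries a schedule.
CREATE OR REPLACE TASK root_load_task
  WAREHOUSE = load_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CALL stage_raw_files();      -- hypothetical procedure

-- Child tasks: no schedule of their own, they run AFTER the root completes.
CREATE OR REPLACE TASK load_dim_customer_task
  WAREHOUSE = load_wh
  AFTER root_load_task
AS
  CALL load_dim_customer();    -- hypothetical procedure

CREATE OR REPLACE TASK load_dim_product_task
  WAREHOUSE = load_wh
  AFTER root_load_task
AS
  CALL load_dim_product();     -- hypothetical procedure

-- Resume the children first, then the root, so the whole tree is active.
ALTER TASK load_dim_customer_task RESUME;
ALTER TASK load_dim_product_task RESUME;
ALTER TASK root_load_task RESUME;
```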

You can schedule your TASK with an interval or a cron expression, or manually execute a task with EXECUTE TASK.
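For illustration, the three options could look like this. The task names, the warehouse `load_wh` and the procedure `load_dim_customer` are assumptions; substitute your own.

```sql
-- Interval schedule: run every 60 minutes.
CREATE OR REPLACE TASK hourly_load_task
  WAREHOUSE = load_wh
  SCHEDULE = '60 MINUTE'
AS
  CALL load_dim_customer();    -- hypothetical procedure

-- Cron schedule: run every day at 02:00 UTC.
CREATE OR REPLACE TASK nightly_load_task
  WAREHOUSE = load_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CALL load_dim_customer();

-- Manual run: trigger a single execution now, independent of the schedule.
EXECUTE TASK nightly_load_task;
```

EXECUTE TASK is handy for testing a standalone task or the root of a tree without waiting for the next scheduled run.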

The data load part of your question is very broad, and the answer depends heavily on your use case:

Bulk
Streaming
Data Source (object storage, local file system, another DB, API, etc.)
Snowflake Native vs. External tooling (Fivetran, Matillion, HVR, etc.)