Databricks + ADF + ADLS2 + Hive = Azure Synapse

I have no experience with Azure Synapse, but my understanding is that it is the same as Databricks, ADF, ADLS2 and Hive in SQL DWH, all together in one workspace with a different name.

Am I wrong?

Yes, in many contexts Azure Synapse and Databricks provide the same Big Data Analytics approach, but there are also a few differences between these services.

With the new functionality in Synapse, we now see some features similar to those in Databricks (e.g. Spark, Delta), which raises the question of how Synapse compares to Databricks and when to use which.

  • Yes, both have Spark, but…

    • Databricks

      • has a proprietary data processing engine (Databricks Runtime) built on a highly optimized version of Apache Spark, advertised as offering 50x performance
      • already has support for Spark 3.0
      • allows users to opt for GPU-enabled clusters and to choose between standard and high-concurrency cluster modes
    • Synapse

      • uses open-source Apache Spark (thus not including all features of Databricks Runtime)
      • has built-in support for .NET for Spark applications
  • Yes, both have notebooks

    • Synapse

      • Nteract notebooks

      • has co-authoring of notebooks, but one person needs to save the notebook before another person sees the change

      • doesn't have automated versioning

    • Databricks

      • Databricks notebooks

      • has real-time co-authoring (both authors see the changes in real time)

      • has automated versioning

  • Yes, both can access data from a data lake

    • Synapse

      • When creating a Synapse workspace, you select a data lake that becomes your primary data lake (you can query it directly from scripts and notebooks)
    • Databricks

      • You need to mount a data lake before using it
  • Yes, both leverage Delta

    • Synapse

      • uses Delta Lake, which is open source
    • Databricks

      • has Databricks Delta, which is built on the open-source Delta Lake but offers some extra optimizations
  • No, they are not the same

    • Synapse

      • has both a traditional SQL engine (to suit traditional BI developers) and a Spark engine (to suit data scientists, analysts and engineers)

      • is a data warehouse (i.e. Synapse Analytics) plus an interface tool (i.e. Synapse Studio)

    • Databricks

      • is not a data warehouse tool but rather a Spark-based notebook tool

      • has a focus on Spark, Delta Engine, MLflow and MLR
  • No, they don't offer the same developer experience

    • Synapse

      • for Spark development, currently offers a developer experience only through Synapse Studio (not through local IDEs)

      • doesn't yet have Git integrated within Synapse Studio notebooks

    • Databricks

      • offers a developer experience within the Databricks UI, through Databricks Connect (i.e. remote connection from Visual Studio Code, PyCharm, etc.) and, soon, through Jupyter and RStudio UIs within Databricks
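
To make the data lake point above a bit more concrete: an ADLS Gen2 location is normally addressed with an `abfss://` URI, whereas a lake mounted in Databricks is usually seen as a path under a mount point (by convention `/mnt/...`). The following is only a minimal sketch of how the two kinds of path are put together. The helper functions and the account, container and mount names are hypothetical and not part of any Azure or Databricks SDK.

```python
def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def mounted_path(mount_name: str, path: str = "") -> str:
    """Build the path as seen under a mount point (the /mnt prefix is a convention, assumed here)."""
    return f"/mnt/{mount_name}/{path.lstrip('/')}"


# Hypothetical storage account, container and folder, for illustration only.
direct = abfss_uri("examplelake", "raw", "sales/2020/")
mounted = mounted_path("raw", "sales/2020/")

print(direct)   # abfss://raw@examplelake.dfs.core.windows.net/sales/2020/
print(mounted)  # /mnt/raw/sales/2020/
```

In a notebook, either string would then be handed to the Spark reader. Which form works depends on how the workspace and its credentials have been configured.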

See also: When to use Synapse and when Databricks?

 