
Multiple cells in a Databricks notebook

I am new to Databricks. Why does a notebook have multiple cells when we could write the whole set of instructions/program in a single cell?

Regards,

The advantage of using multiple cells is that you can break a large program into small portions, one per cell, and execute each cell individually. You don't need to re-run the complete code, which can take a long time when it involves large datasets, exploratory data analysis, transformations, and so on.

In other words, Databricks is a big data analysis tool. A typical workload involves ingesting a large dataset (millions of rows), cleaning it, transforming it, and then running data analysis and machine learning algorithms. If all of these tasks sit in a single cell, every run repeats all of them, which costs a lot of time and compute. Instead, you can put each task in its own cell of the Databricks notebook and run the cells individually.
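To make the idea concrete, here is a minimal sketch in plain Python that imitates notebook cells sharing one session. It is not Databricks code: `session`, `run_cell`, and the fake "ingestion" step are hypothetical stand-ins, assuming only that cells in a notebook execute against a shared namespace.

```python
import time

# One shared namespace stands in for the notebook's session state.
session = {}

def run_cell(source: str) -> float:
    """Execute one 'cell' in the shared session and return elapsed seconds."""
    start = time.perf_counter()
    exec(source, session)
    return time.perf_counter() - start

# Cell 1: stand-in for a slow ingestion step (made-up data, not a real dataset).
cell_ingest = """
import time
time.sleep(0.2)              # pretend this is an expensive load
rows = list(range(1_000))
"""

# Cell 2: a cheap transformation that reuses `rows` created by cell 1.
cell_transform = """
evens = [r * 10 for r in rows if r % 2 == 0]
"""

t_ingest = run_cell(cell_ingest)        # run once
t_transform = run_cell(cell_transform)  # can be edited and re-run on its own

print(len(session["evens"]), session["evens"][:3])  # 500 [0, 20, 40]
```

Re-running only the second cell after an edit reuses `rows` and skips the slow step, which is the saving described above.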

For example, if you are ingesting data from an Azure Data Lake Storage (ADLS) account, you can create a mount point to the required storage resource and path in one cell and run that cell on its own. Once your ADLS container is mounted, you can use another cell to prepare the data. This way you don't need to mount the resource again, because that was already done in the previous cell.
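A rough sketch of such a mount cell follows. The helper name `ensure_mounted` is hypothetical, and so are the storage URL, the mount path `/mnt/raw`, and the `configs` dictionary in the comments. The only Databricks calls used are `dbutils.fs.mounts()` and `dbutils.fs.mount(...)`. The `dbutils` object is passed in as an argument so the function can also be defined outside a notebook.

```python
def ensure_mounted(dbutils, source, mount_point, extra_configs=None):
    """Mount `source` at `mount_point` unless that mount point already exists.

    Returns True if a new mount was created, False if it was already there.
    """
    already = any(m.mountPoint == mount_point for m in dbutils.fs.mounts())
    if already:
        return False
    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=extra_configs or {},
    )
    return True

# Cell 1 (hypothetical values; `configs` would hold your auth settings):
# ensure_mounted(
#     dbutils,
#     "abfss://<container>@<account>.dfs.core.windows.net/",
#     "/mnt/raw",
#     configs,
# )
#
# Cell 2 (prepare the data, re-run as often as needed):
# df = spark.read.csv("/mnt/raw/<path>", header=True)
```

Checking the existing mounts first keeps the mount cell safe to re-run, since mounting a path that is already mounted typically fails with an error.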

