

Spark local mode: How to query the number of executor slots?

I'm following the tutorial Using Apache Spark 2.0 to Analyze the City of San Francisco's Open Data, where it's claimed that the "local mode" Spark cluster available in Databricks "Community Edition" provides you with 3 executor slots. (So 3 tasks should be able to run concurrently.)

However, when I look at the "Event Timeline" visualization for job stages with multiple tasks in my own notebook on Databricks "Community Edition", it looks like up to 8 tasks were running concurrently:

[Screenshot: Event timeline in the Spark UI, showing up to 8 tasks running concurrently]

Is there a way to query the number of executor slots from PySpark or from a Databricks notebook? Or can I see the number directly somewhere in the Spark UI?

Databricks "slots" = Spark "cores" = available threads

"Slots" is a term Databricks uses (or used?) for the threads available to do parallel work for Spark. The Spark documentation and the Spark UI call the same concept "cores", even though they are unrelated to physical CPU cores.

(See this answer on the Hortonworks community, and the "Spark Tutorial: Learning Apache Spark" Databricks notebook.)
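To make the terminology concrete, here is a minimal sketch in plain PySpark (outside Databricks; the master URL and app name are illustrative assumptions) showing that the slot/core count in local mode is just the number of worker threads you request:

from pyspark.sql import SparkSession

# local[3] runs Spark in-process with 3 worker threads, i.e. 3 task slots,
# regardless of how many physical CPU cores the machine has.
spark = (SparkSession.builder
         .master("local[3]")
         .appName("slots-demo")  # hypothetical app name
         .getOrCreate())

# In local mode this usually reports the same number (3 here).
print(spark.sparkContext.defaultParallelism)

With local[3], at most 3 tasks run concurrently, which is the sense in which the tutorial speaks of 3 executor slots.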

View the number of slots/cores/threads in the Spark UI (on Databricks)

To see how many there are in your Databricks cluster, click "Clusters" in the navigation area on the left, then hover over the entry for your cluster and click the "Spark UI" link. In the Spark UI, click the "Executors" tab.

[Annotated screenshot: how to open the Spark UI for a Databricks cluster]

You can see the number of executor cores (= executor slots), both in the summary and for each individual executor¹, in the "Cores" column of the respective table there:

[Screenshot: Spark UI "Executors" tab with the summary table and the per-executor table (only one executor here)]

¹ There's only one executor in "local mode" clusters, which is the kind of cluster available in Databricks Community Edition.

Query the number of slots/cores/threads

I'm not sure how to query this number from within a notebook.

spark.conf.get('spark.executor.cores')

results in java.util.NoSuchElementException: spark.executor.cores
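Two workarounds that may work instead (a hedged sketch, assuming the SparkSession that a Databricks notebook provides as spark; in local mode the master URL has the form local[N] and defaultParallelism usually equals the thread count):

import re

sc = spark.sparkContext

# 1) In local mode the master URL encodes the thread count, e.g. "local[8]".
m = re.match(r"local\[(\d+|\*)\]", sc.master)
if m:
    print("local-mode threads:", m.group(1))  # "*" means one thread per CPU core

# 2) defaultParallelism usually equals the number of available cores/threads.
print("sc.defaultParallelism =", sc.defaultParallelism)

On a multi-executor cluster you could read spark.executor.cores instead (when it has been set explicitly), but as the error above shows, that key is simply absent in local mode.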
