[英]How to access spark executor, worker and master level metrics in databricks?
[英]Spark Metrics: how to access executor and worker data?
注意:我在YARN上使用Spark
我一直在嘗試在Spark中實現的度量系統 。 我啟用了ConsoleSink和CsvSink,並為所有四個實例(驅動程序,主機,執行程序,工作程序)啟用了JvmSource。 但是我只有驅動程序輸出,並且在控制台和csv目標目錄中沒有worker / executor / master數據。
閱讀完此問題后 ,我想知道提交工作時是否必須將某些物品運送給執行者。
我的提交命令: ./bin/spark-submit --class org.apache.spark.examples.SparkPi lib/spark-examples-1.5.0-hadoop2.6.0.jar 10
submit ./bin/spark-submit --class org.apache.spark.examples.SparkPi lib/spark-examples-1.5.0-hadoop2.6.0.jar 10
波紋管是我的metric.properties文件:
# Enable JmxSink for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
# Enable ConsoleSink for all instances by class name
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
# Polling period for ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
#######################################
# worker instance overlap polling period
worker.sink.console.period=5
worker.sink.console.unit=seconds
#######################################
# Master instance overlap polling period
master.sink.console.period=15
master.sink.console.unit=seconds
# Enable CsvSink for all instances
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
#driver.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
# Polling period for CsvSink
*.sink.csv.period=10
*.sink.csv.unit=seconds
# Polling directory for CsvSink
*.sink.csv.directory=/opt/spark-1.5.0-bin-hadoop2.6/csvSink/
# Worker instance overlap polling period
worker.sink.csv.period=10
worker.sink.csv.unit=second
# Enable Slf4jSink for all instances by class name
#*.sink.slf4j.class=org.apache.spark.metrics.sink.Slf4jSink
# Polling period for Slf4JSink
#*.sink.slf4j.period=1
#*.sink.slf4j.unit=minutes
# Enable jvm source for instance master, worker, driver and executor
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
這是Spark創建的csv文件的列表。 我期待為Spark執行程序(也是JVM)訪問相同的數據。
app-20160812135008-0013.driver.BlockManager.disk.diskSpaceUsed_MB.csv
app-20160812135008-0013.driver.BlockManager.memory.maxMem_MB.csv
app-20160812135008-0013.driver.BlockManager.memory.memUsed_MB.csv
app-20160812135008-0013.driver.BlockManager.memory.remainingMem_MB.csv
app-20160812135008-0013.driver.jvm.heap.committed.csv
app-20160812135008-0013.driver.jvm.heap.init.csv
app-20160812135008-0013.driver.jvm.heap.max.csv
app-20160812135008-0013.driver.jvm.heap.usage.csv
app-20160812135008-0013.driver.jvm.heap.used.csv
app-20160812135008-0013.driver.jvm.non-heap.committed.csv
app-20160812135008-0013.driver.jvm.non-heap.init.csv
app-20160812135008-0013.driver.jvm.non-heap.max.csv
app-20160812135008-0013.driver.jvm.non-heap.usage.csv
app-20160812135008-0013.driver.jvm.non-heap.used.csv
app-20160812135008-0013.driver.jvm.pools.Code-Cache.committed.csv
app-20160812135008-0013.driver.jvm.pools.Code-Cache.init.csv
app-20160812135008-0013.driver.jvm.pools.Code-Cache.max.csv
app-20160812135008-0013.driver.jvm.pools.Code-Cache.usage.csv
app-20160812135008-0013.driver.jvm.pools.Code-Cache.used.csv
app-20160812135008-0013.driver.jvm.pools.Compressed-Class-Space.committed.csv
app-20160812135008-0013.driver.jvm.pools.Compressed-Class-Space.init.csv
app-20160812135008-0013.driver.jvm.pools.Compressed-Class-Space.max.csv
app-20160812135008-0013.driver.jvm.pools.Compressed-Class-Space.usage.csv
app-20160812135008-0013.driver.jvm.pools.Compressed-Class-Space.used.csv
app-20160812135008-0013.driver.jvm.pools.Metaspace.committed.csv
app-20160812135008-0013.driver.jvm.pools.Metaspace.init.csv
app-20160812135008-0013.driver.jvm.pools.Metaspace.max.csv
app-20160812135008-0013.driver.jvm.pools.Metaspace.usage.csv
app-20160812135008-0013.driver.jvm.pools.Metaspace.used.csv
app-20160812135008-0013.driver.jvm.pools.PS-Eden-Space.committed.csv
app-20160812135008-0013.driver.jvm.pools.PS-Eden-Space.init.csv
app-20160812135008-0013.driver.jvm.pools.PS-Eden-Space.max.csv
app-20160812135008-0013.driver.jvm.pools.PS-Eden-Space.usage.csv
app-20160812135008-0013.driver.jvm.pools.PS-Eden-Space.used.csv
app-20160812135008-0013.driver.jvm.pools.PS-Old-Gen.committed.csv
app-20160812135008-0013.driver.jvm.pools.PS-Old-Gen.init.csv
app-20160812135008-0013.driver.jvm.pools.PS-Old-Gen.max.csv
app-20160812135008-0013.driver.jvm.pools.PS-Old-Gen.usage.csv
app-20160812135008-0013.driver.jvm.pools.PS-Old-Gen.used.csv
app-20160812135008-0013.driver.jvm.pools.PS-Survivor-Space.committed.csv
app-20160812135008-0013.driver.jvm.pools.PS-Survivor-Space.init.csv
app-20160812135008-0013.driver.jvm.pools.PS-Survivor-Space.max.csv
app-20160812135008-0013.driver.jvm.pools.PS-Survivor-Space.usage.csv
app-20160812135008-0013.driver.jvm.pools.PS-Survivor-Space.used.csv
app-20160812135008-0013.driver.jvm.PS-MarkSweep.count.csv
app-20160812135008-0013.driver.jvm.PS-MarkSweep.time.csv
app-20160812135008-0013.driver.jvm.PS-Scavenge.count.csv
app-20160812135008-0013.driver.jvm.PS-Scavenge.time.csv
app-20160812135008-0013.driver.jvm.total.committed.csv
app-20160812135008-0013.driver.jvm.total.init.csv
app-20160812135008-0013.driver.jvm.total.max.csv
app-20160812135008-0013.driver.jvm.total.used.csv
DAGScheduler.job.activeJobs.csv
DAGScheduler.job.allJobs.csv
DAGScheduler.messageProcessingTime.csv
DAGScheduler.stage.failedStages.csv
DAGScheduler.stage.runningStages.csv
DAGScheduler.stage.waitingStages.csv
由於您沒有給出嘗試的命令,因此我假設您沒有傳遞metrics.properties。 要傳遞metrics.propertis,該命令應為
spark-submit <other parameters> --files metrics.properties
--conf spark.metrics.conf=metrics.properties
注意必須在--files和--conf中指定metrics.properties,這些--files會將metrics.properties文件傳輸到執行程序。 由於您可以在驅動程序而不是執行程序上看到輸出,因此我認為您缺少--files選項。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.