
Pull out metrics from Spark logs

[Image] How do I pull out these metrics from the Spark history logs? Is there an API I can pull them from?

I tried downloading the JSON event logs, but I can't grep for the numbers seen in the photo.

The Spark history server keeps all of that information for you. You can access it via a REST API.

If you are on EMR:

You can view the Spark web UIs by following the procedures to create an SSH tunnel or create a proxy in the section called Connect to the cluster in the Amazon EMR Management Guide and then navigating to the YARN ResourceManager for your cluster. Choose the link under Tracking UI for your application. If your application is running, you see ApplicationMaster. This takes you to the application master's web UI at port 20888 wherever the driver is located. The driver may be located on the cluster's primary node if you run in YARN client mode. If you are running an application in YARN cluster mode, the driver is located in the ApplicationMaster for the application on the cluster. If your application has finished, you see History, which takes you to the Spark HistoryServer UI at port 18080 of the EMR cluster's primary node. This is for applications that have already completed. You can also navigate to the Spark HistoryServer UI directly at http://master-public-dns-name:18080/.
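As a rough illustration of the REST API mentioned above, here is a minimal Python sketch that lists applications from the history server and sums a few stage-level metrics for each. It assumes the history server is reachable at the address quoted above (replace `master-public-dns-name` with your own host) and uses the `/api/v1/applications` and `/api/v1/applications/<app-id>/stages` endpoints from Spark's monitoring API. The helper names (`api_url`, `fetch_json`, `stages_path`, `total_stage_metrics`) and the chosen metric fields are illustrative only; the metrics in your screenshot may live in other endpoints such as `/executors` or `/jobs`.

```python
import json
import urllib.request

# Assumption: replace with the address of your own Spark history server.
BASE_URL = "http://master-public-dns-name:18080"

# A few stage-level fields returned by the stages endpoint (illustrative choice).
STAGE_METRICS = (
    "executorRunTime",
    "inputBytes",
    "outputBytes",
    "shuffleReadBytes",
    "shuffleWriteBytes",
)


def api_url(base_url, *parts):
    """Build a monitoring API URL such as <base>/api/v1/applications."""
    return "/".join([base_url.rstrip("/"), "api", "v1", *parts])


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def stages_path(app):
    """Path parts for an application's stages.

    Applications run in YARN cluster mode carry an attemptId that has to be
    part of the path; this sketch uses the first listed attempt.
    """
    attempts = app.get("attempts", [])
    attempt_id = attempts[0].get("attemptId") if attempts else None
    parts = ["applications", app["id"]]
    if attempt_id:
        parts.append(attempt_id)
    parts.append("stages")
    return parts


def total_stage_metrics(stages, fields=STAGE_METRICS):
    """Sum the selected metric fields over all stages of one application."""
    totals = {field: 0 for field in fields}
    for stage in stages:
        for field in fields:
            totals[field] += stage.get(field, 0)
    return totals


if __name__ == "__main__":
    for app in fetch_json(api_url(BASE_URL, "applications")):
        stages = fetch_json(api_url(BASE_URL, *stages_path(app)))
        print(app["id"], app.get("name"), total_stage_metrics(stages))
```

The totals here are sums over stages, and the UI usually shows aggregated values in human-readable units, which may be why grepping the raw event logs did not turn up the same numbers.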

