
ELK apache spark application log

How do I configure Filebeat to read Apache Spark application logs? As soon as an application completes, the generated logs are moved to the history server in a non-readable format. What is the ideal approach here?

You can configure Spark logging via Log4J. For a discussion of some edge cases in setting up the log4j configuration, see SPARK-16784, but if you simply want to collect all application logs coming off a cluster (as opposed to logs per job) you shouldn't need to consider any of that.
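For illustration, a minimal log4j.properties along these lines (log4j 1.x syntax, as used by Spark versions before 3.3; the log file path and appender names are only placeholders) makes the driver and executors write their logs to a local file that FileBeat can tail:

    # Sketch of $SPARK_HOME/conf/log4j.properties -- adjust the path for your cluster
    log4j.rootCategory=INFO, console, file

    # Keep the default console output
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

    # Rolling file appender whose output FileBeat will pick up
    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.File=/var/log/spark/spark-app.log
    log4j.appender.file.MaxFileSize=50MB
    log4j.appender.file.MaxBackupIndex=5
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n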

On the ELK side, there is a log4j input plugin for Logstash, but it has been deprecated.

Thankfully, the documentation for the deprecated plugin describes how to configure log4j to write data locally for FileBeat, and how to set up FileBeat to consume this data and send it to a Logstash instance. This is now the recommended way to ship logs from systems using log4j.
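As an example, a minimal filebeat.yml along the lines of that documentation might look like this (Filebeat 7.x syntax; the paths, multiline pattern, and Logstash host are placeholders to adapt to your setup):

    filebeat.inputs:
      - type: log
        paths:
          - /var/log/spark/*.log
        # Join stack-trace lines (anything not starting with a yy/MM/dd timestamp) to the previous event
        multiline.pattern: '^\d{2}/\d{2}/\d{2}'
        multiline.negate: true
        multiline.match: after

    output.logstash:
      hosts: ["logstash-host:5044"]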

So in summary, the recommended way to get logs from Spark into ELK is:

  1. Set the Log4J configuration for your Spark cluster to write to local files
  2. Run FileBeat to consume from these files and send the entries to Logstash
  3. Logstash will send the data into Elasticsearch (see the pipeline sketch after this list)
  4. You can search through your indexed log data using Kibana
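As a rough sketch of steps 2-4, the Logstash side can be a simple Beats-to-Elasticsearch pipeline. The grok pattern below is only a placeholder and has to match whatever ConversionPattern your log4j configuration uses, and the host names and index name are assumptions for your environment:

    input {
      beats {
        port => 5044
      }
    }

    filter {
      # Optionally split each log4j line into fields; adjust to your actual layout
      grok {
        match => { "message" => "%{DATE:date} %{TIME:time} %{LOGLEVEL:level} %{DATA:logger}: %{GREEDYDATA:msg}" }
      }
    }

    output {
      elasticsearch {
        hosts => ["http://elasticsearch-host:9200"]
        index => "spark-logs-%{+YYYY.MM.dd}"
      }
    }

You can then point Kibana at the resulting index pattern to search the data.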
