
Real Time Cluster Log Delivery in a Databricks Cluster

I have some Python code that I am running on a Databricks Job Cluster. My code generates a large volume of logs, and I want to be able to monitor these logs in real time (or near real time), say through something like a dashboard.

So far, I have configured my cluster's log delivery location, and the logs are delivered to the specified destination every 5 minutes.
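For reference, cluster log delivery is set through the `cluster_log_conf` field of the cluster specification; a minimal fragment (the DBFS destination path here is illustrative, not my actual path):

```json
{
  "cluster_log_conf": {
    "dbfs": {
      "destination": "dbfs:/cluster-logs/my-job"
    }
  }
}
```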

This is explained here,
https://docs.microsoft.com/en-us/azure/databricks/clusters/configure

Here is an extract from that article:

When you create a cluster, you can specify a location to deliver the logs for the Spark driver node, worker nodes, and events. Logs are delivered every five minutes to your chosen destination. When a cluster is terminated, Azure Databricks guarantees to deliver all logs generated up until the cluster was terminated.

Is there some way I can have these logs delivered somewhere in near real time, rather than every 5 minutes? It does not have to be through the same method either, I am open to other possibilities.

As shown in the screenshot below, the delivery interval is 5 minutes by default. Unfortunately, it cannot be changed; the official documentation does not mention any way to configure it.

[Screenshot: cluster log delivery configuration showing the 5-minute interval]

However, you can raise a feature request here.
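Since the 5-minute delivery interval is fixed, one workaround is to bypass cluster log delivery for the application's own logs and ship them yourself as they are emitted, using a custom `logging` handler. A minimal sketch, where the transport callable is a placeholder assumption (in practice it might POST each line to a log-ingestion endpoint or push to an event stream backing the dashboard):

```python
import logging


class RealTimeForwardingHandler(logging.Handler):
    """Forward each log record immediately instead of waiting for the
    5-minute cluster log delivery. `send` is whatever transport you
    choose (HTTP POST to a collector, an event-stream producer, ...)."""

    def __init__(self, send):
        super().__init__()
        self.send = send  # callable that receives one formatted log line

    def emit(self, record):
        try:
            self.send(self.format(record))
        except Exception:
            self.handleError(record)


# Demo wiring: collect lines in a list. In a real job, replace
# `shipped.append` with a function that transmits the line.
shipped = []
logger = logging.getLogger("job")
handler = RealTimeForwardingHandler(shipped.append)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("starting batch")
print(shipped[0])  # -> INFO starting batch
```

This only covers logs your own code emits, not the Spark driver/executor logs that cluster log delivery captures, but for monitoring application progress on a dashboard it avoids the 5-minute delay entirely.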
