
What happens internally when we restart an Azure Databricks cluster?

When we get many stage failures, we generally restart the cluster to avoid them. I want to know:

1) What exactly happens when we restart it?

2) Does it remove metadata/cache from the cluster?

3) Is there any other way to meet the above requirement without restarting the cluster?

When you restart the cluster, the Spark application is initialized again, literally from scratch: all cache on the cluster is wiped.

You can see this in the cluster driver logs when you restart: Spark initializes, boots all libraries, and loads the metastore and DBFS.

One thing an immediate restart (a gap of no more than ~5 minutes) does not do is deprovision the underlying VM instances hosting the application. If you think a VM is in a bad state, terminate the cluster, wait about 5 minutes, and start it again. (This does not work for clusters on a pool, as pools keep VMs alive even after termination.)
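As a sketch of the alternatives: a restart can be triggered programmatically via the Databricks Clusters API 2.0 (`POST /api/2.0/clusters/restart`), and cached data can be dropped without any restart at all by calling `spark.catalog.clearCache()` from a notebook attached to the cluster. The workspace URL and cluster ID below are hypothetical placeholders.

```python
# Sketch of two alternatives to a manual UI restart. The workspace URL and
# cluster ID are hypothetical; substitute your own values.
import json


def build_restart_request(workspace_url: str, cluster_id: str):
    """Build the REST call that restarts a cluster via the Clusters API 2.0.

    An API restart behaves like the UI restart: the Spark application is
    re-initialized and all in-memory cache is dropped, but the underlying
    VMs are reused rather than deprovisioned.
    """
    endpoint = f"{workspace_url}/api/2.0/clusters/restart"
    payload = json.dumps({"cluster_id": cluster_id})
    return endpoint, payload


# To drop cached data *without* restarting, run this inside a notebook
# attached to the cluster (no REST request needed):
#
#   spark.catalog.clearCache()   # drops all cached tables/DataFrames
#   df.unpersist()               # or unpersist one specific DataFrame

endpoint, payload = build_restart_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",  # hypothetical URL
    "0101-120000-abcd123",                                 # hypothetical ID
)
print(endpoint)
```

Clearing the cache addresses question 3 for memory pressure, but it will not fix a VM in a bad state; for that, the terminate-wait-start cycle described above is still needed.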
