
Kill Spark Job programmatically


I am running a PySpark application from a Jupyter notebook. I can kill a job with the Spark Web UI, but I want to kill it programmatically.

How do I kill it?

To expand on @Netanel Malka's answer, you can use the cancelAllJobs method to cancel every running job, or you can use the cancelJobGroup method to cancel jobs that have been organized into a group.

From the PySpark documentation:

cancelAllJobs()
Cancel all jobs that have been scheduled or are running.

cancelJobGroup(groupId)
Cancel active jobs for the specified group. See SparkContext.setJobGroup for more information.

And an example from the docs:

import threading
from time import sleep

result = "Not Set"
lock = threading.Lock()

def map_func(x):
    # Long-running task; it should be cancelled before the sleep finishes.
    sleep(100)
    raise Exception("Task should have been cancelled")

def start_job(x):
    global result
    try:
        # Tag every job started in this thread with the group "job_to_cancel".
        sc.setJobGroup("job_to_cancel", "some description")
        result = sc.parallelize(range(x)).map(map_func).collect()
    except Exception as e:
        result = "Cancelled"
    lock.release()

def stop_job():
    # Give the job a few seconds to start, then cancel the whole group.
    sleep(5)
    sc.cancelJobGroup("job_to_cancel")

suppress = lock.acquire()
suppress = threading.Thread(target=start_job, args=(10,)).start()
suppress = threading.Thread(target=stop_job).start()
suppress = lock.acquire()  # blocks until start_job releases the lock
print(result)
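
For the simpler case described in the question, where everything the notebook has started should be cancelled, a minimal sketch along the same lines could use cancelAllJobs instead. This assumes sc is the SparkContext already active in the Jupyter session, and cancel_after is just an illustrative helper:

import threading
from time import sleep

# Assumes `sc` is the SparkContext already created by the notebook session.

def cancel_after(seconds):
    # Illustrative helper: wait, then cancel every scheduled or running job.
    sleep(seconds)
    sc.cancelAllJobs()

# Start the cancellation timer in the background, then launch a long job.
threading.Thread(target=cancel_after, args=(5,)).start()
try:
    sc.parallelize(range(10)).map(lambda x: sleep(100) or x).collect()
except Exception:
    print("Job was cancelled")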

Suppose that you wrote the following code:

from pyspark import SparkContext

sc = SparkContext("local", "Simple App")

# This will stop your app
sc.stop()

As described in the documentation: http://spark.apache.org/docs/latest/api/python/pyspark.html?highlight=stop#pyspark.SparkContext.stop
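
Note that, unlike cancelAllJobs / cancelJobGroup above, sc.stop() shuts down the whole SparkContext, so the notebook cannot run any further Spark work until a new context is created. A minimal sketch of that follow-up, reusing the same constructor arguments as above, would be:

from pyspark import SparkContext

sc = SparkContext("local", "Simple App")
sc.stop()  # tears down the whole application, not just the currently running job

# A fresh context is needed before submitting any more jobs (sketch).
sc = SparkContext("local", "Simple App")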
