How do you "permanently" delete an experiment in MLflow?

Permanently deleting an experiment isn't documented anywhere. I'm using MLflow with a Postgres backend database.

Here's what I ran:

from mlflow.tracking import MlflowClient

# `server` is the tracking server URI
client = MlflowClient(tracking_uri=server)
client.delete_experiment(1)

This deletes the experiment, but when I run a new experiment with the same name as the one I just deleted, it returns this error:

mlflow.exceptions.MlflowException: Cannot set a deleted experiment 'cross-sell' as the active experiment. You can restore the experiment, or permanently delete the  experiment to create a new one.

I can't find anything in the documentation that shows how to permanently delete everything.

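Unfortunately, it seems there is currently no way to do this via the UI or the CLI :-/

How to do it depends on the type of backend store you are using.

File store:

If you are using the filesystem as the storage mechanism (the default), this is easy. "Deleted" experiments are moved to a .trash folder. You just need to clear it out: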

rm -rf mlruns/.trash/*

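As of the current version of the documentation (1.7.2), they remark:

It is recommended to use a cron job or an alternate workflow mechanism to clear the .trash folder.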
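If you'd rather schedule this from Python than from a shell one-liner, something like the following could be called from a cron job. It's only a minimal sketch: `purge_trash` is a name I made up, and it assumes the default layout where trashed experiments sit directly under `<mlruns>/.trash`.

```python
import shutil
import tempfile
from pathlib import Path


def purge_trash(mlruns_dir: str) -> int:
    """Delete every entry under <mlruns_dir>/.trash; return how many were removed."""
    trash = Path(mlruns_dir) / ".trash"
    if not trash.is_dir():
        return 0  # nothing has been trashed yet
    removed = 0
    for entry in trash.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # one trashed experiment directory
        else:
            entry.unlink()  # stray file or symlink
        removed += 1
    return removed


if __name__ == "__main__":
    # Demo against a throwaway directory, not a real tracking store.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ".trash" / "1").mkdir(parents=True)
        (Path(tmp) / ".trash" / "1" / "meta.yaml").write_text("name: cross-sell\n")
        print("removed entries:", purge_trash(tmp))
```

Point it at your real `mlruns` directory only once you're sure nothing in `.trash` needs to be restored.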

SQL database:

This is trickier, because there are dependencies that need to be deleted first. I am using MySQL, and these commands worked for me:

USE mlflow_db;  # the name of your database
DELETE FROM experiment_tags WHERE experiment_id=ANY(
    SELECT experiment_id FROM experiments where lifecycle_stage="deleted"
);
DELETE FROM latest_metrics WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage="deleted"
    )
);
DELETE FROM metrics WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage="deleted"
    )
);
DELETE FROM tags WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage="deleted"
    )
);
DELETE FROM runs WHERE experiment_id=ANY(
    SELECT experiment_id FROM experiments where lifecycle_stage="deleted"
);
DELETE FROM experiments where lifecycle_stage="deleted";

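As of mlflow 1.11.0, the recommended way to permanently delete runs within an experiment is: mlflow gc [OPTIONS]

From the documentation for mlflow gc:

Permanently delete runs in the deleted lifecycle stage from the specified backend store. This command deletes all artifacts and metadata associated with the specified runs.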
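To make that concrete, here is a small sketch of calling `mlflow gc` from Python. The helper names are mine, and I'm only relying on the `--backend-store-uri` and `--run-ids` options; check `mlflow gc --help` for what your installed version actually supports. The store URI in the demo is a placeholder.

```python
import subprocess
from typing import List, Optional, Sequence


def build_gc_command(backend_store_uri: str,
                     run_ids: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argv for `mlflow gc` (hypothetical helper, not part of MLflow)."""
    cmd = ["mlflow", "gc", "--backend-store-uri", backend_store_uri]
    if run_ids:
        # --run-ids takes a comma-separated list of run IDs
        cmd += ["--run-ids", ",".join(run_ids)]
    return cmd


def run_gc(backend_store_uri: str, run_ids: Optional[Sequence[str]] = None) -> None:
    """Run `mlflow gc`; raises CalledProcessError if the command fails."""
    subprocess.run(build_gc_command(backend_store_uri, run_ids), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_gc(...) to actually execute it.
    print(" ".join(build_gc_command("sqlite:///mlflow.db")))
```

Note that the quoted description talks about runs; whether the experiment rows themselves are also removed may depend on your MLflow version.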

I'm adding the SQL commands for permanently emptying MLflow's trash when you use PostgreSQL as the backend store.

Switch to your MLflow database, e.g. with: \c mlflow and then:

DELETE FROM experiment_tags WHERE experiment_id=ANY(
    SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
);
DELETE FROM latest_metrics WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM metrics WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM tags WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs WHERE experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM params WHERE run_uuid=ANY(
    SELECT run_uuid FROM runs where experiment_id=ANY(
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
));
DELETE FROM runs WHERE experiment_id=ANY(
    SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
);
DELETE FROM experiments where lifecycle_stage='deleted';

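The difference is that I added a DELETE command for the "params" table.

Extending @Lee Netherton's answer, you can use PyMySQL to execute these queries and remove all the metadata from the MLflow tracking server after deleting the experiment through the MLflow tracking client.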

import pymysql

def perm_delete_exp():
    # Connection settings are placeholders; replace them with your own.
    connection = pymysql.connect(
        host='localhost',
        user='user',
        password='password',
        db='mlflow',
        cursorclass=pymysql.cursors.DictCursor)
    queries = """
        USE mlflow;
        DELETE FROM experiment_tags WHERE experiment_id=ANY(SELECT experiment_id FROM experiments where lifecycle_stage="deleted");
        DELETE FROM latest_metrics WHERE run_uuid=ANY(SELECT run_uuid FROM runs WHERE experiment_id=ANY(SELECT experiment_id FROM experiments where lifecycle_stage="deleted"));
        DELETE FROM metrics WHERE run_uuid=ANY(SELECT run_uuid FROM runs WHERE experiment_id=ANY(SELECT experiment_id FROM experiments where lifecycle_stage="deleted"));
        DELETE FROM tags WHERE run_uuid=ANY(SELECT run_uuid FROM runs WHERE experiment_id=ANY(SELECT experiment_id FROM experiments where lifecycle_stage="deleted"));
        DELETE FROM runs WHERE experiment_id=ANY(SELECT experiment_id FROM experiments where lifecycle_stage="deleted");
        DELETE FROM experiments where lifecycle_stage="deleted";
    """
    try:
        with connection.cursor() as cursor:
            # One statement per line: skip the empty first line and the trailing
            # whitespace-only line, then execute each query separately.
            for query in queries.splitlines()[1:-1]:
                cursor.execute(query.strip())
        connection.commit()
    finally:
        # Close the connection even if one of the queries fails.
        connection.close()

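You could (and probably should) execute the whole query at once, but I found it easier to debug this way.

Unfortunately, the SQL commands above did not work with SQLite in my case. Here is a version that works with SQLite in a database IDE, replacing "any" with "in":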

DELETE FROM experiment_tags WHERE experiment_id in (
    SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    );
DELETE FROM latest_metrics WHERE run_uuid in (
    SELECT run_uuid FROM runs WHERE experiment_id in (
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM metrics WHERE run_uuid in (
    SELECT run_uuid FROM runs WHERE experiment_id in (
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM tags WHERE run_uuid in (
    SELECT run_uuid FROM runs WHERE experiment_id in (
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
    )
);
DELETE FROM params WHERE run_uuid in (
    SELECT run_uuid FROM runs where experiment_id in (
        SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
));
DELETE FROM runs WHERE experiment_id in (
    SELECT experiment_id FROM experiments where lifecycle_stage='deleted'
);
DELETE FROM experiments where lifecycle_stage='deleted';

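If you use S3 as the backend store for artifacts and have an EC2 server for tracking, here is my workaround for deleting a complete experiment "folder".

Delete complete experiments on S3 from a list of experiment IDs: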

import os

def permanently_delete_mlflow_experiments(list_of_experiment_ids: list):
    # loop over the experiment ids you want to delete
    for experiment_id in list_of_experiment_ids:
        print(f'deleting experiment {experiment_id}')
        # run shell command for S3 deletion via aws s3 rm
        # (replace YOUR_BUCKET_URI with the S3 URI of your artifact bucket)
        os.system(f"aws s3 rm YOUR_BUCKET_URI --recursive --exclude '*' --include '{experiment_id}/*'")

Delete specific runs from a list of run IDs:

import os
from mlflow.tracking import MlflowClient

def permanently_delete_runs_on_mlflow(list_of_runs_id: list):
    # YOUR_MLFLOW_TRACKING_URI is a placeholder for your tracking server URI
    mlflow_client = MlflowClient(tracking_uri=YOUR_MLFLOW_TRACKING_URI)
    for run_id in list_of_runs_id:
        # retrieve experiment id corresponding to the run id
        experiment_id = mlflow_client.get_run(run_id).info.experiment_id
        print(f'deleting run {run_id} from experiment {experiment_id}')
        # remove only that run's "subfolder" inside the experiment "folder"
        os.system(f"aws s3 rm YOUR_BUCKET_URI --recursive --exclude '*' --include '{experiment_id}/{run_id}/*'")

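Note that you need the AWS CLI installed for this to work.

It basically runs a shell command from Python to do the job. As a side note, MLflow tracking on EC2 creates "folders" on S3 named after the experiment ID, each containing a "subfolder" for every run ID belonging to that experiment. The code above relies on this structure.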
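As a variation on the same idea, the command can be passed to subprocess as an argument list rather than a formatted shell string, so the IDs are never interpreted by a shell. This is only a sketch under the same folder-layout assumption; `build_s3_rm_command` and `delete_from_s3` are hypothetical names, and the bucket URI in the demo is a placeholder.

```python
import subprocess
from typing import List


def build_s3_rm_command(bucket_uri: str, experiment_id: str, run_id: str = "") -> List[str]:
    """Build the `aws s3 rm` argv for a whole experiment, or for one run inside it."""
    pattern = f"{experiment_id}/{run_id}/*" if run_id else f"{experiment_id}/*"
    # No shell is involved, so the '*' patterns reach the AWS CLI unexpanded.
    return ["aws", "s3", "rm", bucket_uri, "--recursive",
            "--exclude", "*", "--include", pattern]


def delete_from_s3(bucket_uri: str, experiment_id: str, run_id: str = "") -> None:
    """Execute the deletion; raises CalledProcessError if the AWS CLI fails."""
    subprocess.run(build_s3_rm_command(bucket_uri, experiment_id, run_id), check=True)


if __name__ == "__main__":
    # Only prints the command; call delete_from_s3(...) to actually delete.
    print(build_s3_rm_command("s3://your-bucket/mlflow", "3"))
```

With `check=True` a failed `aws` call raises an exception instead of being silently ignored, which is the main practical difference from `os.system`.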
