
How to safely shut down mlflow ui?

After running mlflow ui on a remote server, I'm unable to start mlflow ui again.
A workaround is to kill all of my processes on the server using pkill -u MyUserName.
Otherwise I get the following error:

[INFO] Starting gunicorn 20.0.4  
[ERROR] Connection in use: ('127.0.0.1', 5000)
[ERROR] Retrying in 1 second.  
...
Running the mlflow server failed. Please see the logs above for details.

I understand the error, but I don't understand:
1. What is the correct way to shut down mlflow ui?
2. How can I identify the mlflow ui process so that I can kill only that process instead of using pkill -u?

Currently I either close the browser or press Ctrl+C.

I ran into a similar problem recently when running mlflow ui on a remote server. Pressing Ctrl+C in the terminal usually works. When it doesn't, pkill -f gunicorn solves the problem for me. Note that you can also use ps -A | grep gunicorn to find the process first and then run kill [PID] manually. A similar problem seems to have been discussed here once.
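Since pkill -f gunicorn also hits any unrelated gunicorn app you own, a narrower filter can help. The following is a minimal sketch, assuming the UI is served by gunicorn with mlflow.server:app on its command line (as in the ps output quoted further down this page). The function name mlflow_gunicorn_pids is made up for illustration.

```shell
# Print the PIDs of gunicorn processes whose command line mentions
# mlflow.server:app. Expects lines on stdin with the PID in the first
# column followed by the command line, e.g. from: ps -A -o pid= -o args=
# (hypothetical helper name, not part of MLflow)
mlflow_gunicorn_pids() {
    awk '/gunicorn/ && /mlflow\.server:app/ { print $1 }'
}

# Typical use:
#   ps -A -o pid= -o args= | mlflow_gunicorn_pids
#   kill <PID>        # for each PID printed above
```

Other gunicorn applications are left alone because both patterns have to match on the same line.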

If you can't connect to MLflow because it is already running, first find the process that is holding the port, then kill that PID so you can spawn a new UI:

lsof -i :5000
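lsof -i only lists the process; it does not stop it. A small sketch of how the lookup and the kill could be combined, assuming lsof is installed. The function name port_listener_pids is hypothetical.

```shell
# Print the PID(s) of whatever is listening on a TCP port (default 5000).
# -t prints bare PIDs only; -sTCP:LISTEN keeps listeners and skips
# client connections to the same port.
# (hypothetical helper name)
port_listener_pids() {
    lsof -t -i "TCP:${1:-5000}" -sTCP:LISTEN
}

# Typical use:
#   port_listener_pids 5000          # inspect first
#   kill $(port_listener_pids 5000)  # then send SIGTERM
```

Checking the output before killing is worthwhile, since port 5000 may belong to something other than MLflow.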

Also, with MLflow you can use --port to assign whatever port number you want, which avoids confusion if you need multiple UIs running, e.g. one for tracking and one for serving. By default the server runs on port 5000. If that port is already in use, use the --port option to specify a different port:

mlflow models serve -m runs:/<RUN_ID>/model --port 1234

UPDATE June 2022: You can also add the --port flag to the command shown here to properly set up MLflow: How do you start using MLflow SQL storage instead of the file system storage?

Quick solution:

Simply kill the process bound to the port:

fuser -k 5000/tcp

Command syntax

fuser -k <port>/tcp

Bonus: fuser 5000/tcp prints the PID of the process bound to that port.

Note: Works on Linux only. A more portable option is lsof -i4 (or lsof -i6 for IPv6).

I was getting an error from the mlflow ui command.

The error was:

[2022-04-19 10:48:02 -0400] [89933] [INFO] Starting gunicorn 20.1.0
[2022-04-19 10:48:02 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:02 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:03 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:03 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:04 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:04 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:05 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:05 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:06 -0400] [89933] [ERROR] Connection in use: ('127.0.0.1', 5000)
[2022-04-19 10:48:06 -0400] [89933] [ERROR] Retrying in 1 second.
[2022-04-19 10:48:07 -0400] [89933] [ERROR] Can't connect to ('127.0.0.1', 5000)

Solution that worked for me:

Step 1: Get the process id

ps -A | grep gunicorn

20734 ?? 0:39.17 /usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python /Users/XXX/env/bin/gunicorn -b 127.0.0.1:5000 -w 1 mlflow.server:app

Step 2: Take the PID from the output above (the first column) and kill that process, since it is the one using the port

kill 20734
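Plain kill sends SIGTERM, which gives the process a chance to shut down cleanly. If it occasionally hangs, the two steps above could be wrapped as follows. This is only a sketch: the function name stop_gracefully and the 10-second default are arbitrary choices, not anything MLflow provides.

```shell
# Send SIGTERM to a PID, wait up to `timeout` seconds for it to exit,
# and fall back to SIGKILL only if it is still alive afterwards.
# (hypothetical helper name; the default timeout is an arbitrary choice)
stop_gracefully() {
    pid=$1
    timeout=${2:-10}
    kill "$pid" 2>/dev/null || return 0   # already gone, or not ours to signal
    while [ "$timeout" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        timeout=$((timeout - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
}

# Typical use, with the PID found in step 1:
#   stop_gracefully 20734
```

kill -0 sends no signal at all; it only reports whether the process still exists.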

By default, the MLflow UI binds to port 5000, so a second invocation while the first is still running fails with a "port in use" error.

You can launch multiple MLflow UIs by giving each one a different port number:

Usage: mlflow ui [OPTIONS]

  Launch the MLflow tracking UI for local viewing of run results. To launch
  a production server, use the "mlflow server" command instead.

  The UI will be visible at http://localhost:5000 by default, and only
  accept connections from the local machine. To let the UI server accept
  connections from other machines, you will need to pass ``--host 0.0.0.0``
  to listen on all network interfaces (or a specific interface address).

Options:
  --backend-store-uri PATH     URI to which to persist experiment and run
                               data. Acceptable URIs are SQLAlchemy-compatible
                               database connection strings (e.g.
                               'sqlite:///path/to/file.db') or local
                               filesystem URIs (e.g.
                               'file:///absolute/path/to/directory'). By
                               default, data will be logged to the ./mlruns
                               directory.
  --default-artifact-root URI  Path to local directory to store artifacts, for
                               new experiments. Note that this flag does not
                               impact already-created experiments. Default:
                               ./mlruns
  -p, --port INTEGER           The port to listen on (default: 5000).
  -h, --host HOST              The network address to listen on (default:
                               127.0.0.1). Use 0.0.0.0 to bind to all
                               addresses if you want to access the tracking
                               server from other machines.
  --help                       Show this message and exit.

Try it and see what happens.
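If you don't want to guess which port is free, something like the following could pick one. It is a rough sketch that requires bash (it relies on bash's /dev/tcp redirection) and only checks 127.0.0.1, which matches the default host in the help text above. Both function names are invented for this example.

```shell
# Requires bash: /dev/tcp is a bash redirection feature, not a real file.
# Succeeds if something accepts TCP connections on 127.0.0.1:<port>.
# (hypothetical helper names)
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print the first port at or above the given one (default 5000)
# on which nothing is accepting connections.
first_free_port() {
    port=${1:-5000}
    while port_in_use "$port"; do
        port=$((port + 1))
    done
    echo "$port"
}

# Typical use:
#   mlflow ui --port "$(first_free_port 5000)"
```

There is a small race between the check and the launch, so this is a convenience rather than a guarantee.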
