Gunicorn / flask API exposing an sklearn model is not working

Question

I can't seem to figure this out. I've got a model trained with scikit-learn, saved to a .pkl file, and I want to build an API that makes predictions based on it.

I already have the code that makes predictions, and it runs fine from the console and from unit tests. To speed up predictions I'm splitting the data (thousands of image patches) and spreading the load using joblib / multiprocessing.

I'm setting JOBLIB_START_METHOD=forkserver since scikit-learn hangs if used from within a multiprocessing process.
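
For illustration, the chunk-and-fan-out pattern described above can be sketched with the standard library alone. This is not the asker's code: `ProcessPoolExecutor` stands in for joblib's `Parallel`, and `split_into_chunks`, `predict_chunk`, and `predict_parallel` are hypothetical names. The forkserver context is requested explicitly, which is roughly what JOBLIB_START_METHOD=forkserver asks joblib to do.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal slices."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def predict_chunk(chunk):
    """Placeholder for model.predict(chunk) (assumption: a real model would
    be loaded once per worker). Must be a module-level function so that
    worker processes can import it."""
    return [sum(patch) for patch in chunk]


def predict_parallel(patches, n_workers=2):
    """Predict on all patches, spreading chunks over worker processes."""
    chunks = split_into_chunks(patches, n_workers)
    if n_workers == 1:
        # Run in-process, with no worker processes at all.
        results = [predict_chunk(c) for c in chunks]
    else:
        # The forkserver start method is only available on Unix.
        ctx = multiprocessing.get_context("forkserver")
        with ProcessPoolExecutor(max_workers=n_workers, mp_context=ctx) as pool:
            results = list(pool.map(predict_chunk, chunks))
    # Flatten the per-chunk results back into input order.
    return [y for chunk_result in results for y in chunk_result]
```

When calling `predict_parallel(patches, n_workers=2)` from a script, do it under an `if __name__ == "__main__":` guard, because forkserver workers re-import the main module.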

I've got an API built with flask that uses this code, and when run with flask's dev server it works just fine. Now I'm trying to host the flask app within gunicorn and it's not working at all.

If I use the default workers, it just hangs with no errors when trying to predict, much as if I hadn't set the 'forkserver' start method. I'm running gunicorn like this:

JOBLIB_START_METHOD=forkserver gunicorn -w 2 -b 0.0.0.0:$PORT --timeout 3600 web.app:app
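
The same settings can also be kept in a Gunicorn configuration file, which is plain Python. The sketch below mirrors the command above; the file name `gunicorn.conf.py` and the fallback port 8000 are assumptions, not something from the question.

```python
# gunicorn.conf.py (hypothetical file name), loaded with:
#   gunicorn -c gunicorn.conf.py web.app:app
import os

# Equivalent of -b 0.0.0.0:$PORT; 8000 is an assumed fallback when PORT is unset.
bind = "0.0.0.0:" + os.environ.get("PORT", "8000")

# Equivalent of -w 2: two worker processes of the default (sync) worker class.
workers = 2

# Equivalent of --timeout 3600: seconds a worker may stay silent before it is restarted.
timeout = 3600

# Environment variables to set for the server, replacing the
# JOBLIB_START_METHOD=forkserver prefix on the command line.
raw_env = ["JOBLIB_START_METHOD=forkserver"]
```

Keeping the environment variable in the config file means it cannot be forgotten when the service is started by hand or from a process manager.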

I also tried using the gevent backend. This actually does work, but it's very slow, and it prints this:

Multiprocessing backed parallel loops cannot be nested below threads, setting n_jobs=1

So, any ideas on getting this to work with multiple web workers running (I don't think that's the case with flask's dev server) and with each request able to leverage joblib / multiprocessing? Thanks.

Answer 1

Gevent won't work with joblib since it spawns thread(s) to handle requests concurrently (refer to this discussion), and that's what your warning actually says. Secondly, it's very slow because joblib converts your parallel calls into sequential calls and executes them (refer to this discussion).
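
To make the warning concrete: older joblib releases printed it when a multiprocessing-backed parallel call was made from a thread other than the main one, and then ran the jobs sequentially. The following is a simplified stand-in for that kind of check using only the standard library, not joblib's actual code; `effective_workers` and `call_in_thread` are hypothetical helpers.

```python
import threading


def effective_workers(requested):
    """Return the worker count a thread-averse backend would allow.

    Simplified illustration: outside the main thread, fall back to 1,
    which is the behaviour the "setting n_jobs=1" warning describes.
    """
    if threading.current_thread() is not threading.main_thread():
        return 1
    return requested


def call_in_thread(func, *args):
    """Run func(*args) in a fresh thread and return its result."""
    box = {}
    worker = threading.Thread(target=lambda: box.update(result=func(*args)))
    worker.start()
    worker.join()
    return box["result"]
```

Called directly from the main thread, `effective_workers(4)` returns 4; called through `call_in_thread(effective_workers, 4)` it returns 1, which corresponds to the slowdown seen under the gevent worker.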

I faced the same problem while running parallel work with joblib. Although I didn't use sklearn, I think the following command should work for you as well:

gunicorn -b 0.0.0.0:$SERVICE_PORT --workers=2 -t $SERVICE_TIMEOUT rest_api:app

If you want to have a look at the complete source code, you can follow it here.

Answer 1
1 accepted 2017-12-07 18:02:13