
Heroku Dyno Crash after TensorFlow Serving Container Enters the Event Loop

I'm trying to deploy my TensorFlow model using Docker, TensorFlow Serving, and Heroku. Everything goes fine until the TF Serving container finishes initialization (when it outputs "Entering the event loop"): at that point the Heroku web dyno suddenly crashes. It then restarts and tries again, but crashes once it reaches the event loop. After the third attempt, Heroku simply stops spinning up the dyno.

First, I deploy the image without any problem:

C:\Users\whitm\Desktop\CodeProjects\deep-deblurring-serving>heroku container:release web
Releasing images web to deep-deblurring-serving... done

C:\Users\whitm\Desktop\CodeProjects\deep-deblurring-serving>heroku ps
Free dyno hours quota remaining this month: 550h 0m (100%)
Free dyno usage for this app: 0h 0m (0%)
For more information on dyno sleeping and how to upgrade, see:
https://devcenter.heroku.com/articles/dyno-sleeping

=== web (Free): /usr/bin/tf_serving_entrypoint.sh (1)
web.1: starting 2020/04/10 15:36:38 -0400 (~ 6s ago)

After about one minute of initialization (when TF Serving reaches the event loop), the dyno crashes:

2020-04-10T19:36:53.234387+00:00 app[web.1]: [evhttp_server.cc : 238] NET_LOG: Entering the event loop ...
2020-04-10T19:36:53.234389+00:00 app[web.1]: 2020-04-10 19:36:53.234341: I tensorflow_serving/model_servers/server.cc:378] Exporting HTTP/REST API at:localhost:8501 ...
2020-04-10T19:37:46.597354+00:00 heroku[web.1]: State changed from starting to crashed
2020-04-10T19:37:46.602976+00:00 heroku[web.1]: State changed from crashed to starting

Then Heroku restarts it automatically:

C:\Users\whitm\Desktop\CodeProjects\deep-deblurring-serving>heroku ps
Free dyno hours quota remaining this month: 550h 0m (100%)
Free dyno usage for this app: 0h 0m (0%)
For more information on dyno sleeping and how to upgrade, see:
https://devcenter.heroku.com/articles/dyno-sleeping

=== web (Free): /usr/bin/tf_serving_entrypoint.sh (1)
web.1: restarting 2020/04/10 15:37:46 -0400 (~ 45s ago)

The cycle repeats three times; after the last attempt, Heroku stops restarting the dyno:

C:\Users\whitm\Desktop\CodeProjects\deep-deblurring-serving>heroku ps
Free dyno hours quota remaining this month: 550h 0m (100%)
Free dyno usage for this app: 0h 0m (0%)
For more information on dyno sleeping and how to upgrade, see:
https://devcenter.heroku.com/articles/dyno-sleeping

=== web (Free): /usr/bin/tf_serving_entrypoint.sh (1)
web.1: crashed 2020/04/10 15:38:53 -0400 (~ 3m ago)

This is not a problem with the container: it works like a charm locally, reaches the event loop, and starts listening for incoming requests. I can make a request without a problem. So the problem is on Heroku's side, but I don't know what is going on. I suspect Heroku is interpreting the container as a non-responsive application, but I'm not sure. Worst of all, I can't SSH into the container unless the dyno is in the "running" state, and that state is never reached because it crashes during initialization.

One last thing: the container uses 448 MB of RAM locally, and Heroku free dynos have 500 MB, so I suspect it is crashing due to memory, but again, I can't get in to check what is going on.
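The closest I can get is inspecting things from the outside. This is roughly what I mean (the heroku run one-off dyno is just an idea I haven't verified):

# Local memory footprint of the running container
docker stats --no-stream

# Heroku-side errors: look for R10 (boot timeout) or R14/R15 (memory quota) in the logs
heroku logs --tail --app deep-deblurring-serving

# Idea (unverified): a one-off dyno should boot the same image even while web.1 is crashed
heroku run bash --app deep-deblurring-serving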

What can I do? Where should I look?

Thanks in advance!

P.S.: I tried running a lighter model that only uses 20 MB of RAM locally, but the result was the same on Heroku: the dyno crashes.

I solved the problem. It was caused by a container port mismatch: TensorFlow Serving was trying to use the default port 8501 for the REST API, but Heroku assigns a different port to expose the container. The solution was to update the /usr/bin/tf_serving_entrypoint.sh file so that tensorflow_model_server uses the port assigned by Heroku (the $PORT environment variable).

This is the new Dockerfile:

FROM tensorflow/serving
LABEL maintainer="Whitman Bohorquez" description="Build tf serving based image. This repo must be used as build context"

# Copy the build context (this repo) into the image root
COPY / /
RUN apt-get update && apt-get install -y git && git reset --hard
ENV MODEL_NAME=deblurrer MODEL_BASE_PATH=/models

# Overwrite the stock entrypoint so the REST API binds to the port Heroku assigns ($PORT)
# instead of the hard-coded default 8501
RUN echo '#!/bin/bash \n\n\
tensorflow_model_server \
--rest_api_port=$PORT \
--model_name=${MODEL_NAME} \
--model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
"$@"' > /usr/bin/tf_serving_entrypoint.sh \
&& chmod +x /usr/bin/tf_serving_entrypoint.sh

# CMD is required to run on Heroku
CMD ["/usr/bin/tf_serving_entrypoint.sh"]
