Very high latency running Python Flask app on gcloud app engine

Question

I have a small Python Flask app that receives a posted file and runs it through a TensorFlow Keras model to return a prediction.

Running locally on my old laptop, it is very fast. The app consumes around 450 MB of RAM.

Now I have deployed this app to gcloud App Engine, and I am seeing extremely high latency, ranging from 1,900 to 3,500 ms, which is about 1000x slower than on my own laptop. Not only is it slow, but it also starts too many instances because of it.

I have tried with F2 and F4 instances (F1 doesn't provide enough memory), but it doesn't make a difference.

app.yaml

runtime: python37
env: standard
instance_class: F2
entrypoint: gunicorn -b :$PORT main:app

main.py

from flask import Flask
from flask_cors import CORS
from flask_restful import Api, Resource, reqparse, abort
from firebase import verifyToken, log
from model_manager import predict
import werkzeug, os
import tempfile

app = Flask(__name__)
CORS(app)
api = Api(app)

post_args = reqparse.RequestParser()
post_args.add_argument('file', type=werkzeug.datastructures.FileStorage, location='files', help="No file provided.", required=True)
post_args.add_argument('Authorization', type=str, location='headers', help="No auth token provided.", required=True)

class Analyze(Resource):
    def post(self):
        data = post_args.parse_args()

        if not verifyToken(data['Authorization']):
            abort(401, message="The user is not authorized to use this ")

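        tf = None  # defined up front so the finally block can always check it
        try:
            # mkstemp creates a real file in the system temp directory
            fd, tf = tempfile.mkstemp()
            os.close(fd)
            data['file'].save(tf)
            result = predict(tf)
        except Exception as ex:
            # pass a string: the exception object itself is not JSON serializable
            abort(400, message=str(ex))
        finally:
            if tf and os.path.exists(tf):
                os.remove(tf)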

        return result, 200

api.add_resource(Analyze, "/")

if __name__ == "__main__":
    app.run(debug=False)

Am I doing something wrong here that causes the high latency?

Answer 1

The best practice is to use TensorFlow Serving. Read https://www.tensorflow.org/tfx/guide/serving for more detail.
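
As a rough sketch of what that could look like from the Flask side, the app would forward the input to a model server over TensorFlow Serving's REST predict endpoint instead of running the model itself. The host, port, and model name `my_model` below are placeholders I made up, and the helper names `build_predict_request` and `predict_remote` are hypothetical; the shape of `instances` depends on what your model expects.

```python
import json
import urllib.request

# Assumed placeholders: host, port and model name are not from the question.
SERVING_URL = "http://localhost:8501/v1/models/my_model:predict"


def build_predict_request(instances, url=SERVING_URL):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    # The REST API accepts a JSON body with an "instances" list (row format).
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(instances, url=SERVING_URL, timeout=10):
    """Send the instances to the model server and return its predictions."""
    request = build_predict_request(instances, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A successful response carries the results under "predictions".
        return json.loads(response.read())["predictions"]
```

With something like this, the `predict(tf)` call in the handler would be replaced by preprocessing the uploaded file into a list and passing it to `predict_remote`, so the web instances no longer need to hold the model in memory.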
