Very high latency running Python Flask app on gcloud app engine

I have this small Python Flask app that accepts a posted file as input and runs it through a TensorFlow Keras model to return a prediction.

On my old laptop, running this locally is super fast. The app consumes around 450 MB of RAM.

Now I have deployed this app to gcloud App Engine, and I experience extremely high latency, ranging from 1,900 to 3,500 ms, roughly 1000x slower than on my own laptop. Not only is it slow, but it also starts too many instances because of it.

I have tried with F2 and F4 instances (F1 doesn't provide enough memory), but it doesn't make a difference.

app.yaml

runtime: python37
env: standard
instance_class: F2
entrypoint: gunicorn -b :$PORT main:app

main.py

from flask import Flask
from flask_cors import CORS
from flask_restful import Api, Resource, reqparse, abort
from firebase import verifyToken, log
from model_manager import predict
import werkzeug, os
import tempfile

app = Flask(__name__)
CORS(app)
api = Api(app)

post_args = reqparse.RequestParser()
post_args.add_argument('file', type=werkzeug.datastructures.FileStorage, location='files', help="No file provided.", required=True)
post_args.add_argument('Authorization', type=str, location='headers', help="No auth token provided.", required=True)

class Analyze(Resource):
    def post(self):
        data = post_args.parse_args()

        if not verifyToken(data['Authorization']):
            abort(401, message="The user is not authorized to use this ")

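        tf = None  # defined up front so the finally block can always check it
        try:
            # Reserve a temp file path that persists until we remove it ourselves.
            with tempfile.NamedTemporaryFile(delete=False) as tmp:
                tf = tmp.name
            data['file'].save(tf)
            result = predict(tf)
        except Exception as ex:
            # The message must be a string to be serialized in the response.
            abort(400, message=str(ex))
        finally:
            if tf and os.path.exists(tf):
                os.remove(tf)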

        return result, 200

api.add_resource(Analyze, "/")

if __name__ == "__main__":
    app.run(debug=False)

Am I doing something wrong here that causes the high latency?

The best practice is to use TensorFlow Serving. See https://www.tensorflow.org/tfx/guide/serving for more detail.
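
As a rough sketch of what that could look like from the Flask side: instead of loading the Keras model inside the web process, the app would send the preprocessed input to a separately running TensorFlow Serving instance over its REST predict endpoint. The host, port, and model name (`my_model`) below are placeholders, not values from the question, and the shape of `instances` depends on what the model expects.

```python
import json
import urllib.request

# Assumed values: host, port, and model name are placeholders, not from the question.
SERVING_URL = "http://localhost:8501/v1/models/my_model:predict"


def build_predict_request(instances, url=SERVING_URL):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(instances, url=SERVING_URL, timeout=10.0):
    """Send the instances to the model server and return its predictions."""
    request = build_predict_request(instances, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]
```

With something like this, `predict(tf)` in the handler could be replaced by code that reads the uploaded file, converts it to the model's input format, and calls `predict_remote(...)`, so the web instances no longer need to hold the model in memory.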
