
How to increase transfer speed when POSTing an image to a REST API

I am new to developing REST APIs and am trying to deploy a machine learning model for image segmentation using Python and REST APIs.
On the server side I am using FastAPI, while on the client side I use the Python requests library. The client already resizes the image to the input size required by the model, so it doesn't send unnecessarily large images. The server feeds the received image to the model and returns the binary segmentation mask. The image and the mask are converted from numpy arrays to lists, which are then sent as JSON data.
Below is some code representing what I've just described. Since I cannot share the model here, the server in this minimal reproducible example simply returns the same image it received.

server.py

import uvicorn
from fastapi import FastAPI
import numpy as np
from datetime import datetime

app = FastAPI()

@app.get('/test')
def predict_and_process(data: dict = None):
    start = datetime.now()
    if data:
        # Stand-in for model inference: rebuild the array from the nested list
        image = np.asarray(data['image'])
        print("Time to run: ", datetime.now() - start)
        # Convert back to a nested list so it can be returned as JSON
        return {'prediction': np.squeeze(image).tolist()}
    else:
        return {'msg': "Model or data not available"}

def run():
    PORT = 27010
    uvicorn.run(
        app,
        host="127.0.0.1", 
        port=PORT,
    )


if __name__=='__main__':
    run()

client.py

import requests
import numpy as np
from matplotlib.pyplot import imread 
from skimage.transform import resize
from datetime import datetime

def test_speed():
    path_to_img = r"path_to_some_image"
    
    image = imread(path_to_img)
    image = resize(image, (1024, 1024))
    img_list = image.tolist()

    data = {'image': img_list}
    start = datetime.now()
    response = requests.get('http://127.0.0.1:27010/test', json=data)

    prediction = response.json()['prediction']
    print("time for prediction: {}".format(datetime.now() - start))

if __name__=='__main__':
    test_speed()

The output from the server is:

(cera) PS C:\Users\user_name\Desktop\MRM\REST> python .\server.py
INFO:     Started server process [20448]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:27010 (Press CTRL+C to quit)
Time to run:  0:00:00.337099
INFO:     127.0.0.1:61631 - "GET /test HTTP/1.1" 200 OK

and the output from the client is:

(cera) PS C:\Users\user_name\Desktop\MRM\REST> python .\client.py
time for prediction: 0:00:16.845123

Since the code running on the server takes less than a second, the time needed to transfer the image from the client to the server (or back) must be somewhere around 8 seconds, which is definitely too long.
I can't send smaller images, since the input size of the model has to stay the same.

So, for a deployment/REST newbie: what would be a professional, best-practice way to get my predictions from a REST API faster? I assume there are limits since I'm using Python, but 16 seconds still seems way too long to me.
Thank you in advance!

As @slizb pointed out, encoding the image to base64 makes everything much faster. Instead of img_list = image.tolist(), use

data = {'shape': image.shape, 'img': base64.b64encode(image.tobytes()).decode('ascii')}

and on the server

image = np.frombuffer(base64.b64decode(data['img'])).reshape(data['shape'])

Make sure to send the shape as well, because numpy can't recover the shape from a flat buffer, so the server has to .reshape() the array manually. (Note that np.frombuffer defaults to float64, which happens to match the dtype that skimage's resize produces; sending the dtype along explicitly is safer.)
The overall time went down to about 1 second, which is mostly the inference time of my model.
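
Putting the fragments together, here is a minimal, self-contained sketch of the base64 round trip described above. No HTTP is involved; json.dumps/json.loads stands in for the request body, and a random array stands in for the resized image:

import base64
import json
import numpy as np

# Stand-in for the resized image (skimage's resize returns float64)
image = np.random.rand(1024, 1024, 3)

# Client side: pack the raw bytes as a base64 string, plus shape and dtype
payload = json.dumps({
    'shape': image.shape,
    'dtype': str(image.dtype),
    'img': base64.b64encode(image.tobytes()).decode('ascii'),
})

# Server side: decode, rebuild the flat array, restore the shape
data = json.loads(payload)
restored = np.frombuffer(
    base64.b64decode(data['img']), dtype=data['dtype']
).reshape(data['shape'])

assert np.array_equal(image, restored)

For a 1024x1024x3 float64 image the raw buffer is about 24 MB, so the base64 payload is about 32 MB, versus on the order of 60 MB of decimal text for the JSON list. More importantly, neither side has to JSON-encode or parse millions of individual floats.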

I would suggest reading through this documentation and trying the examples provided for your image upload route.

https://fastapi.tiangolo.com/tutorial/request-files/
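
If you go that route, a minimal sketch of such an upload endpoint could look like the following. The route name /predict, the multipart field name file, and the use of Pillow to decode the bytes are illustrative assumptions, not part of the original answer (FastAPI also needs the python-multipart package installed for form parsing):

import io
import numpy as np
from fastapi import FastAPI, File, UploadFile
from PIL import Image

app = FastAPI()

@app.post('/predict')
async def predict(file: UploadFile = File(...)):
    # Read the raw bytes of the uploaded file (e.g. a PNG or JPEG)
    raw = await file.read()
    # Decode to a numpy array; stand-in for feeding the model
    image = np.asarray(Image.open(io.BytesIO(raw)))
    return {'shape': image.shape}

and on the client the file is sent as multipart/form-data instead of JSON:

import requests

with open('image.png', 'rb') as f:
    response = requests.post('http://127.0.0.1:27010/predict', files={'file': f})

Sending the compressed image file directly keeps the payload at its on-disk size, which is typically far smaller than either the JSON list or the raw pixel buffer.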
