Can we use Pydantic models (BaseModel) directly inside model.predict() using FastAPI, and if not, why?

Question

I'm using a Pydantic model (BaseModel) with FastAPI, converting the input into a dictionary and then into a Pandas DataFrame, which I pass to the model.predict() function for machine learning prediction, as shown below:

from fastapi import FastAPI
import uvicorn
from pydantic import BaseModel
import pandas as pd
from typing import List

app = FastAPI()

# `classifier` is assumed to be a trained model loaded elsewhere

class Inputs(BaseModel):
    f1: float
    f2: float
    f3: str

@app.post('/predict')
def predict(features: List[Inputs]):
    output = []

    # loop over the list of input features
    for data in features:
        result = {}

        # convert data into a dict and then into a DataFrame
        data = data.dict()
        df = pd.DataFrame([data])

        # get prediction
        prediction = classifier.predict(df)[0]

        # get probability
        probability = classifier.predict_proba(df).max()

        # assign to dictionary
        result["prediction"] = prediction
        result["probability"] = probability

        # append dictionary to list (many outputs)
        output.append(result)

    return output

It works fine, but I'm not sure whether it's optimized or the right way to do it, since I convert the input twice to get the predictions. I'm also not sure whether it will be fast with a huge number of inputs. Are there any improvements for this? Is there a way (even other than using Pydantic models) to work with the input directly and avoid the conversions and the loop?

Answer 1

First, you should use more descriptive names for your variables/objects. For example:

@app.post('/predict')
def predict(inputs: List[Inputs]):
    for input in inputs:
        # ...

You cannot pass the Pydantic model directly to the predict() function, as it expects array-like data (for example, a list of rows, a NumPy array, or a DataFrame), not a Pydantic model. Available options are listed below.
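To make the point concrete, here is a minimal sketch. ThresholdModel is a hypothetical stand-in for a trained classifier (it is not part of the question's code), and it assumes the model expects a 2D array-like of shape (n_samples, n_features). The Inputs model is reduced to two numeric fields for brevity.

```python
import numpy as np
from pydantic import BaseModel


class Inputs(BaseModel):
    f1: float
    f2: float


class ThresholdModel:
    """Hypothetical stand-in for a trained classifier (assumption, not a real library class)."""

    def predict(self, X):
        # Like many ML libraries, coerce the input to a 2D numeric array first
        X = np.asarray(X, dtype=float)
        if X.ndim != 2:
            raise ValueError(f"expected a 2D array, got {X.ndim}D")
        return (X.sum(axis=1) > 1.0).astype(int)


model = ThresholdModel()
item = Inputs(f1=0.4, f2=0.9)

# Passing the Pydantic object itself fails: it is not array-like
try:
    model.predict(item)
    accepted = True
except (TypeError, ValueError):
    accepted = False

# Passing the field values as a single row works
prediction = int(model.predict([[item.f1, item.f2]])[0])
print(accepted, prediction)
```

The model only ever sees rows of values, so the Pydantic object has to be unpacked at some point; the options differ only in how that unpacking is done.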

Option 1

You could use:

prediction = model.predict([[input.f1, input.f2, input.f3]])[0]

Option 2

If you don't wish to use a Pandas DataFrame, as shown in your question, i.e.,

df = pd.DataFrame([input.dict()])
prediction = model.predict(df)[0]

then you could use the __dict__ attribute to get the values of all attributes in the model and convert them to a list:

prediction = model.predict([list(input.__dict__.values())])[0]

or, preferably, use Pydantic's .dict() method (deprecated in Pydantic v2 in favor of .model_dump()):

prediction = model.predict([list(input.dict().values())])[0]
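Since the row is built from the dictionary's values, the feature order follows the order in which the fields are declared on the model, and that order has to match what the classifier was trained on. A small sketch of a helper that works on both Pydantic v1 and v2 is shown below; to_row is a hypothetical name introduced here for illustration.

```python
from pydantic import BaseModel


class Inputs(BaseModel):
    f1: float
    f2: float
    f3: str


def to_row(item: BaseModel) -> list:
    """Return the field values in declaration order (hypothetical helper)."""
    # model_dump() exists in Pydantic v2; fall back to .dict() on v1
    dump = getattr(item, "model_dump", None) or item.dict
    return list(dump().values())


row = to_row(Inputs(f1=1.5, f2=2.0, f3="a"))
print(row)
```

The resulting list can then be wrapped in an outer list and passed to predict() as a single-row batch.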

Option 3

You could avoid looping over individual items and calling the predict() function multiple times by using the following instead:

import pandas as pd

df = pd.DataFrame([i.dict() for i in inputs])
prediction = model.predict(df)
probability = model.predict_proba(df)
return {'prediction': prediction.tolist(), 'probability': probability.tolist()}

or (in case you don't wish to use a Pandas DataFrame):

inputs_list = [list(i.dict().values()) for i in inputs]
prediction = model.predict(inputs_list)
probability = model.predict_proba(inputs_list)
return {'prediction': prediction.tolist(), 'probability': probability.tolist()}
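If you want to keep the per-item output shape from the question (one prediction and its highest class probability per input), the batched results can be zipped back together. The following is a rough sketch under stated assumptions: StubClassifier and predict_batch are hypothetical names, the stub only mimics the predict()/predict_proba() interface, and the Inputs model is reduced to numeric fields.

```python
from typing import List

import numpy as np
from pydantic import BaseModel


class Inputs(BaseModel):
    f1: float
    f2: float


class StubClassifier:
    """Hypothetical stand-in exposing predict() and predict_proba() (assumption)."""

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        # Probability of class 1 from a fixed logistic rule
        p1 = 1.0 / (1.0 + np.exp(-(X[:, 0] - X[:, 1])))
        return np.column_stack([1.0 - p1, p1])

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)


def predict_batch(inputs: List[Inputs], model) -> list:
    # Build all rows once, then call the model once per method
    rows = [[i.f1, i.f2] for i in inputs]
    predictions = model.predict(rows)
    # Highest class probability per row, as in the question's .max()
    probabilities = model.predict_proba(rows).max(axis=1)
    # .tolist() converts NumPy scalars to plain Python types for JSON
    return [
        {"prediction": p, "probability": q}
        for p, q in zip(predictions.tolist(), probabilities.tolist())
    ]


results = predict_batch(
    [Inputs(f1=2.0, f2=0.0), Inputs(f1=0.0, f2=2.0)], StubClassifier()
)
print(results)
```

Note that a model whose preprocessing selects columns by name would still need the DataFrame variant, since a plain list of rows carries no column names.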

Solution 1 (0, accepted, 2022-04-12 22:37:44)
