
How do I call out to a specific model version with Python and a TensorFlow serving model?

I have a few machine learning models running via TensorFlow Serving on Kubernetes. I'd like to be able to have one deployment of a particular model, and then load multiple versions.

This seems like it would be easier than having to maintain a separate Kubernetes deployment for each version of each model that we have.

But it's not obvious how to pass the version or model flavor I want to call using the Python gRPC interface to TF Serving. How do I specify the version and pass it in?

For whatever reason, it's not possible to update the model spec in place while you're building the prediction request. Instead, you need to separately build an instance of ModelSpec that includes the version you want, and then pass it to the constructor of the prediction request.

It's also worth pointing out that you need to use the Google-specific Int64Value wrapper for the version.

from google.protobuf.wrappers_pb2 import Int64Value
from tensorflow_serving.apis.model_pb2 import ModelSpec
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc
from tensorflow import make_tensor_proto
import numpy as np
import grpc

model_name = 'mymodel'
input_name = 'model_input'
model_uri = 'mymodel.svc.cluster.local:8500'

# Optional channel options, e.g. raising the default gRPC message size limits.
MESSAGE_OPTIONS = [('grpc.max_send_message_length', 100 * 1024 * 1024),
                   ('grpc.max_receive_message_length', 100 * 1024 * 1024)]

X = ...  # something that works (your input array)

channel = grpc.insecure_channel(model_uri, options=MESSAGE_OPTIONS)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# The version has to be wrapped in a protobuf Int64Value, not passed as a plain int.
version = Int64Value(value=1)
model_spec = ModelSpec(version=version, name=model_name, signature_name='serving_default')

# Pass the ModelSpec to the PredictRequest constructor instead of mutating it in place.
request = predict_pb2.PredictRequest(model_spec=model_spec)
request.inputs[input_name].CopyFrom(make_tensor_proto(X.astype(np.float32), shape=X.shape))

result = stub.Predict(request, 1.0)  # second positional argument is the timeout in seconds
channel.close()
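
The PredictResponse returned by stub.Predict keys each output TensorProto by the output names in the model's serving signature. A minimal sketch of unpacking it into a NumPy array, assuming a hypothetical output name of 'model_output' (check your model's signature, e.g. with saved_model_cli, for the real key):

from tensorflow import make_ndarray

output_name = 'model_output'  # assumed output name; use your signature's actual key
y = make_ndarray(result.outputs[output_name])  # convert the TensorProto to a NumPy array
print(y.shape)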
