An ensemble model represents a pipeline of one or more models and the connection of input and output tensors between those models. Ensemble models are intended to be used to encapsulate a procedure that involves multiple models, such as “data preprocessing -> inference -> data postprocessing”. Using ensemble models for this purpose can avoid the overhead of transferring intermediate tensors and minimize the number of requests that must be sent to Triton. (From the Triton Inference Server documentation.)
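In Triton, an ensemble is declared in a config.pbtxt file with the special `ensemble` platform: an `ensemble_scheduling` block lists the step models and maps their input and output tensors onto the ensemble's own tensors. Below is a minimal sketch of such a definition; all model and tensor names are illustrative, not those of the example model in this repository.

```protobuf
name: "ensemble-model"
platform: "ensemble"
max_batch_size: 0
input [
  { name: "INPUT", data_type: TYPE_FP32, dims: [ 4 ] }
]
output [
  { name: "OUTPUT", data_type: TYPE_FP32, dims: [ 4 ] }
]
ensemble_scheduling {
  step [
    {
      # First step: a preprocessing model fed by the ensemble's input.
      model_name: "preprocess"
      model_version: -1
      input_map { key: "PRE_IN", value: "INPUT" }
      output_map { key: "PRE_OUT", value: "preprocessed" }
    },
    {
      # Second step: the actual model; its output becomes the ensemble's.
      model_name: "infer"
      model_version: -1
      input_map { key: "INFER_IN", value: "preprocessed" }
      output_map { key: "INFER_OUT", value: "OUTPUT" }
    }
  ]
}
```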
This is a simple example of how to deploy and use Ensemble models in OpenShift AI with the Triton runtime.
- Triton must be deployed as a custom Single Model Serving Runtime in OpenShift AI. Two examples are provided (you can deploy both if you want to test different configurations; a trimmed sketch of the REST variant follows this list):
- A REST interface version: runtime-rest.yaml
- A gRPC interface version: runtime-grpc.yaml
- An Ensemble model. An example is available in the model01 folder. Please note that this model is just a stub that does very little apart from passing data through several steps to illustrate how Ensembles work in Triton.
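The runtime definitions referenced above are standard KServe ServingRuntime resources wrapping the Triton container image. The actual definitions are in runtime-rest.yaml and runtime-grpc.yaml; a heavily trimmed sketch of the REST variant, with an illustrative image tag, looks roughly like this:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: triton-rest
spec:
  supportedModelFormats:
    - name: triton
      autoSelect: true
  multiModel: false
  containers:
    - name: kserve-container
      image: nvcr.io/nvidia/tritonserver:23.05-py3  # illustrative tag
      args:
        - tritonserver
        - --model-repository=/mnt/models
        - --allow-http=true
        - --http-port=8080
      ports:
        - containerPort: 8080
          protocol: TCP
```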
- Copy the whole content of the model folder (normally several models plus the Ensemble definition) to an object store bucket, keeping the folder structure intact (see the upload sketch after this list).
- In OpenShift AI, create a Data Connection pointing to the bucket.
- Serve the model in OpenShift AI using the custom runtime you imported, pointing it to the data connection.
- After a few seconds or minutes, the model is served and an inference endpoint becomes available.
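The copy in the first step above can be done with any S3 tool; as a sketch, using boto3 (endpoint, credentials and bucket name are placeholders, and model01 is assumed to sit in the current directory):

```python
import os
import boto3

# Endpoint, credentials and bucket name are illustrative placeholders;
# use the values from your own object store / Data Connection.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.com",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Mirror the local model folder into the bucket, preserving the Triton
# model-repository layout (each sub-model plus the ensemble definition).
for root, _dirs, files in os.walk("model01"):
    for name in files:
        local_path = os.path.join(root, name)
        s3.upload_file(local_path, "models-bucket", local_path.replace(os.sep, "/"))
```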
Two example notebooks, test-ensemble-rest.ipynb and test-ensemble-grpc.ipynb, show how to connect to the model using either REST or gRPC.
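As a minimal alternative to the notebooks, a REST inference call with the tritonclient package could look like the sketch below. The host and the tensor names are placeholders: use the inference endpoint shown by OpenShift AI and the input/output names from the ensemble's config.pbtxt.

```python
import numpy as np
import tritonclient.http as httpclient

# Host and tensor names are placeholders; the route exposed by OpenShift AI
# is HTTPS, hence ssl=True and port 443.
client = httpclient.InferenceServerClient(
    url="ensemble-model-project.apps.example.com:443",
    ssl=True,
)

# Check that the ensemble is loaded before sending a request.
assert client.is_model_ready("ensemble-model")

# Build a dummy FP32 input tensor matching the ensemble's declared shape.
data = np.zeros((1, 4), dtype=np.float32)
inputs = [httpclient.InferInput("INPUT", list(data.shape), "FP32")]
inputs[0].set_data_from_numpy(data)
outputs = [httpclient.InferRequestedOutput("OUTPUT")]

result = client.infer("ensemble-model", inputs, outputs=outputs)
print(result.as_numpy("OUTPUT"))
```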