Great article!
1. How hard is it to add a custom model to Triton?
2. Can the preprocessing be moved to the server side, not the client?
Thanks Paul!
1. Pretty easy:
1.1 If you want to add your custom model as a (PyTorch) .pt one, you'll just have to specify the input/output layer configurations in config.pbtxt (see the sketch after this list).
1.2 If you want to convert/optimize your model for a specific framework (ONNX, TensorRT), you'll have to make sure the custom operators you use are supported by ONNX so that you can export the model to ONNX, and then to TensorRT if you wish. After that, write the config file as in point 1.1.
2. Yes. Using model ensembles, you can treat preprocessing-inference-postprocessing as a single pipeline. For the preprocessing and postprocessing steps you'll have to add Python scripts (Python-backend models) that do just that, and then specify the "flow" within your config.pbtxt (see the ensemble sketch below).
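For 1.1, here's a minimal sketch of what such a config.pbtxt might look like for a TorchScript .pt model. The model name, data types, and dims are hypothetical placeholders you'd replace with your own model's details:

```
# model_repository/my_custom_model/config.pbtxt  (hypothetical model name and shapes)
name: "my_custom_model"
platform: "pytorch_libtorch"
max_batch_size: 8
input [
  {
    # TorchScript models expose positional tensors, hence the INPUT__<index> naming
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]   # example shape; replace with your model's input shape
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]          # example shape; replace with your model's output shape
  }
]
```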
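And for 2, a rough sketch of an ensemble config.pbtxt that wires preprocessing, the model, and postprocessing together. Here "preprocess" and "postprocess" are assumed to be Python-backend models you'd write yourself, and all tensor names are illustrative:

```
# model_repository/image_pipeline/config.pbtxt  (hypothetical names throughout)
name: "image_pipeline"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_IMAGE"  data_type: TYPE_UINT8  dims: [ -1 ] }
]
output [
  { name: "PIPELINE_OUTPUT"  data_type: TYPE_FP32  dims: [ 1000 ] }
]
ensemble_scheduling {
  step [
    {
      # assumed Python-backend model that decodes/normalizes the raw bytes
      model_name: "preprocess"
      model_version: -1
      input_map  { key: "preprocess_in"   value: "RAW_IMAGE" }
      output_map { key: "preprocess_out"  value: "preprocessed_image" }
    },
    {
      # the PyTorch model from point 1.1
      model_name: "my_custom_model"
      model_version: -1
      input_map  { key: "INPUT__0"   value: "preprocessed_image" }
      output_map { key: "OUTPUT__0"  value: "model_logits" }
    },
    {
      # assumed Python-backend model that turns logits into the final response
      model_name: "postprocess"
      model_version: -1
      input_map  { key: "postprocess_in"   value: "model_logits" }
      output_map { key: "postprocess_out"  value: "PIPELINE_OUTPUT" }
    }
  ]
}
```

In each step, the input_map/output_map keys are the composing model's own tensor names, and the values are the ensemble-level tensor names that pass data between steps.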
I'll plan and write an article about this, since these are very good questions!
Nice! I definitely have to give Triton a spin. It seems powerful.
Great blog post 🚀
Thanks Pau!
I really appreciate it 🙏