DML: Top 6 ML Platform Features You Must Know to Build an ML System
Why is serving an ML model using a batch architecture so powerful? The top 6 ML platform features you must know.
Hello there, I am Paul Iusztin 👋
Within this newsletter, I will help you decode complex topics about ML & MLOps one week at a time 🔥
This week we will cover:
Top 6 ML platform features you must know to build an ML system
Why is serving an ML model using a batch architecture so powerful?
Story: “I never forget anything” - said no one but your second brain.
This week, no shameless promotion.
#1. Top 6 ML platform features you must know to build an ML system
Here they are ↓
#1. Experiment Tracking
In your ML development phase, you generate lots of experiments.
Tracking and comparing the metrics between them is crucial in finding the optimal model.
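As a minimal, tool-agnostic sketch of what an experiment tracker does for you (run names and metric values below are made up for illustration; in practice a tool such as MLflow or W&B handles this):

```python
# Minimal experiment-tracking sketch: log metrics per run, then compare runs.
experiments = {}  # run_name -> {metric_name: value}


def log_metric(run: str, name: str, value: float) -> None:
    """Record a metric value for a given experiment run."""
    experiments.setdefault(run, {})[name] = value


def best_run(metric: str, higher_is_better: bool = True) -> str:
    """Return the run with the best value for the given metric."""
    runs = {r: m[metric] for r, m in experiments.items() if metric in m}
    return max(runs, key=runs.get) if higher_is_better else min(runs, key=runs.get)


# Illustrative values only.
log_metric("run-baseline", "f1", 0.78)
log_metric("run-larger-lr", "f1", 0.81)
log_metric("run-more-data", "f1", 0.86)

winner = best_run("f1")  # "run-more-data"
```

A real tracker adds persistence, dashboards, and per-step metric curves on top of exactly this comparison logic.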
#2. Metadata Store
Its primary purpose is reproducibility.
To know how a model was generated, you need to know:
- the version of the code
- the version of the packages
- hyperparameters/config
- total compute
- version of the dataset
... and more
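The list above can be captured as one record per run. Here is a hedged sketch (the commit hash, package pins, and config values are placeholders, not real data):

```python
import json
import platform

# Sketch: bundle the reproducibility metadata listed above into one JSON record.
# All concrete values below are placeholders for illustration.
run_metadata = {
    "code_version": "a1b2c3d",            # e.g., the git commit hash of the training code
    "python_version": platform.python_version(),
    "packages": {"numpy": "1.26.4"},      # pin the exact package versions used
    "hyperparameters": {"lr": 3e-4, "batch_size": 64},
    "compute": {"gpus": 1, "hours": 2.5},
    "dataset_version": "features:3.1.2",  # points at a versioned data artifact
}

# Serialize the record; a metadata store persists this next to the experiment.
record = json.dumps(run_metadata, indent=2, sort_keys=True)
```

Storing this record alongside every run is what lets you answer "how was this model generated?" months later.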
#3. Visualizations
Most of the time, along with the metrics, you must log a set of visualizations for your experiment.
Such as:
- images
- videos
- prompts
- t-SNE graphs
- 3D point clouds
... and more
#4. Reports
You don't work in a vacuum.
You have to present your work to colleagues or clients.
A report lets you take the metadata and visualizations from your experiment...
...and create, deliver and share a targeted presentation for your clients or peers.
#5. Artifacts
The most powerful feature out of them all.
An artifact is a versioned object that is an input or output for your task.
Everything can be an artifact, but the most common cases are:
- data
- model
- code
Wrapping your assets into an artifact ensures reproducibility.
For example, you wrap your features into an artifact (e.g., features:3.1.2), which you can consume in your ML development step.
The ML development step will generate config (e.g., config:1.2.4) and code (e.g., code:1.0.2) artifacts used in the continuous training pipeline.
Doing so lets you quickly answer questions such as "What did I use to generate the model?" and "Which version?"
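A toy sketch of the idea: a store where every save is pinned to a "name:version" reference, so consumers always know exactly what they loaded. (This in-memory class is illustrative; real tools such as W&B Artifacts or DVC provide durable versions of it.)

```python
# Sketch of a versioned artifact store: each asset is addressed by "name:version",
# so downstream steps can pin the exact inputs they consumed.
class ArtifactStore:
    def __init__(self):
        self._store = {}  # (name, version) -> object

    def save(self, name: str, version: str, obj) -> str:
        """Save an object under a version and return its pinned reference."""
        self._store[(name, version)] = obj
        return f"{name}:{version}"

    def load(self, ref: str):
        """Load an object by its pinned "name:version" reference."""
        name, version = ref.split(":")
        return self._store[(name, version)]


store = ArtifactStore()
# Version numbers follow the example above (features:3.1.2); the payload is a stand-in.
ref = store.save("features", "3.1.2", {"rows": 10_000})
features = store.load("features:3.1.2")
```

Because the reference is explicit, the lineage question "what generated this model?" reduces to listing the pinned refs it consumed.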
#6. Model Registry
The model registry is the ultimate way to make your model accessible to your production ecosystem.
For example, in your continuous training pipeline, after the model is trained, you load the weights as an artifact into the model registry (e.g., model:1.2.4).
You label this model as "staging" under a new version and prepare it for testing. If the tests pass, mark it as "production" under a new version and prepare it for deployment (e.g., model:2.1.5).
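The staging-to-production flow above can be sketched in a few lines. (This is a toy in-memory registry for illustration; real registries such as MLflow's expose the same register/promote operations.)

```python
# Sketch of a model registry's staging -> production lifecycle.
class ModelRegistry:
    def __init__(self):
        self.versions = {}  # version -> {"weights": ..., "stage": ...}

    def register(self, version: str, weights) -> None:
        """Register freshly trained weights; new versions start in staging."""
        self.versions[version] = {"weights": weights, "stage": "staging"}

    def promote(self, version: str) -> None:
        """Mark a version as production once its tests pass."""
        self.versions[version]["stage"] = "production"

    def production_version(self):
        """Return the version currently labeled production, if any."""
        for v, info in self.versions.items():
            if info["stage"] == "production":
                return v
        return None


registry = ModelRegistry()
registry.register("1.2.4", weights=b"placeholder-weights")  # trained -> staging
registry.promote("1.2.4")                                   # tests passed -> production
```

The production ecosystem then only ever asks the registry "give me the production model", decoupling deployment from training.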
.
All of these features are used in a mature ML system. What is your favorite one?
↳ You can see all these features in action in my FREE course: The Full Stack 7-Steps MLOps Framework.
#2. Why is serving an ML model using a batch architecture so powerful?
When you first start deploying your ML model, you want an initial end-to-end flow as fast as possible.
Doing so lets you quickly provide value, get feedback, and even collect data.
.
But here is the catch...
Successfully serving an ML model is tricky as you need many iterations to optimize your model to work in real-time:
- low latency
- high throughput
Initially, serving your model in batch mode is like a hack.
By storing the model's predictions in dedicated storage, you effectively move your model from offline batch mode to an online serving experience.
Thus, you no longer have to care about your model's latency and throughput; the consumer directly loads the precomputed predictions from the given storage.
These are the main steps of a batch architecture:
- extract raw data from a real data source
- clean, validate, and aggregate the raw data within a feature pipeline
- load the cleaned data into a feature store
- experiment to find the best model + transformations using the data from the feature store
- upload the best model from the training pipeline into the model registry
- inside a batch prediction pipeline, use the best model from the model registry to compute the predictions
- store the predictions in some storage
- let the consumer download the predictions from the storage
- repeat the whole process hourly, daily, weekly, etc. (depending on your context)
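The serving end of the steps above can be sketched as follows (the model and storage are stand-ins; in practice the storage would be S3/GCS or a database, and the pipeline would run on a scheduler):

```python
# Sketch of batch serving: precompute predictions on a schedule and let the
# consumer read them from storage, so model latency never affects the user.
predictions_storage = {}  # stands in for real storage (S3, GCS, a database)


def model_predict(user_id: int) -> list:
    """Placeholder model: returns fixed item ids per user, for illustration."""
    return [user_id * 10 + i for i in range(3)]


def batch_prediction_pipeline(user_ids: list) -> None:
    """Runs hourly/daily/weekly to refresh all predictions in storage."""
    for uid in user_ids:
        predictions_storage[uid] = model_predict(uid)


def consumer(user_id: int) -> list:
    """The consumer never calls the model; it only reads precomputed results."""
    return predictions_storage[user_id]


batch_prediction_pipeline([1, 2, 3])
recommendations = consumer(2)
```

Note how the consumer's read path is a simple lookup: that is why latency and throughput of the model itself stop mattering.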
.
The main downside of deploying your model in batch mode is that the predictions will have a level of lag.
For example, in a recommender system, if you compute your predictions daily, the system won't capture a user's behavior in real time; it will update the predictions only at the end of the day.
Moving to other architectures, such as request-response or streaming, will be natural after your system matures in batch mode.
So remember, when you initially deploy your model, using a batch mode architecture will be your best shot for a good user experience.
Story: “I never forget anything” - said no one but your second brain.
After 6+ months of refinement, this is my second brain strategy ↓
Tiago Forte's book inspired me, but I adapted his system to my needs.
.
#0. Collect
This is where you are bombarded with information from all over the place.
#1. The Graveyard
This is where I save everything that looks interesting.
I won't use 90% of what is here, but it satisfies my urge to save that "cool article" I saw on LinkedIn.
Tools: mostly browser bookmarks; occasionally GitHub stars, Medium lists, etc.
#2. The Board
Here, I start converging the information and planning what to do next.
Tools: Notion
#3. The Field
Here is where I express myself through learning, coding, writing, etc.
Tools: whatever you need to express yourself.
Steps 2 & 3 are iterative. Thus, I often bounce between them until the information is distilled.
#4. The Warehouse
Here is where I take the distilled information and write it down for cold storage.
Tools: Notion, Google Drive
.
When I want to search for a piece of information, I start from the Warehouse and go backward until I find what I need.
As a minimalist, I keep my tools to a minimum. I primarily use only Brave, Notion, and Google Drive.
You don't need 100+ tools to be productive. They just want to take your money.
So remember...
You have to:
- collect
- link
- plan
- distill
- store
That's it for today!
See you next Thursday at 9:00 am CET.
Have a fantastic weekend!
Paul
Whenever you're ready, here is how I can help you:
The Full Stack 7-Steps MLOps Framework: a 7-lesson FREE course that will walk you step-by-step through how to design, implement, train, deploy, and monitor an ML batch system using MLOps good practices. It contains the source code + 2.5 hours of reading & video materials on Medium.
Machine Learning & MLOps Blog: here, I approach in-depth topics about designing and productionizing ML systems using MLOps.
Machine Learning & MLOps Hub: a place where I will constantly aggregate all my work (courses, articles, webinars, podcasts, etc.).