DML: 8 types of MLOps tools that must be in your toolbelt to be a successful MLOps engineer
How to successfully present MLOps ideas to upper management. How I generated PyDocs for 100 Python functions in <1 hour
Hello there, I am Paul Iusztin 👋
Within this newsletter, I will help you decode complex topics about ML & MLOps one week at a time 🔥
The last Hands-on LLM series finished last week. In case you are curious, here are the top 3 out of 9 lessons of the series:
Lesson 6: What do you need to fine-tune an open-source LLM to create your financial advisor?
Lesson 7: How do you generate a Q&A dataset in <30 minutes to fine-tune your LLMs?
Lesson 4: How to implement a streaming pipeline to populate a vector DB for real-time RAG?
This week's topics:
8 types of MLOps tools that must be in your toolbelt to be a successful MLOps engineer
How to successfully present MLOps ideas to upper management
How I generated PyDocs for 100 Python functions in <1 hour
Before diving into the topics, I have one important thing to share with you.
We finally finished the code & video lessons for the Hands-on LLMs course 🔥
By finishing the Hands-On LLMs free course, you will learn how to use the 3-pipeline architecture & LLMOps good practices to design, build, and deploy a real-time financial advisor powered by LLMs & vector DBs.
We will primarily focus on the engineering & MLOps aspects.
Thus, by the end of this series, you will know how to build & deploy a real ML system, not some isolated code in Notebooks.
More precisely, these are the 3 components you will learn to build:
1. a real-time streaming pipeline (deployed on AWS) that listens to financial news, cleans & embeds the documents, and loads them to a vector DB
2. a fine-tuning pipeline (deployed as a serverless continuous training) that fine-tunes an LLM on financial data using QLoRA, monitors the experiments using an experiment tracker, and saves the best model to a model registry
3. an inference pipeline built in LangChain (deployed as a serverless RESTful API) that loads the fine-tuned LLM from the model registry and answers financial questions using RAG (leveraging the vector DB populated with financial news in real-time)
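To make the data flow between the streaming and inference pipelines more concrete, here is a minimal sketch in plain Python. It is not the course code: `embed`, `InMemoryVectorDB`, `streaming_pipeline`, and `build_rag_prompt` are hypothetical names, the "embedding" is a toy bag-of-words vector, and the in-memory store only stands in for a real vector DB such as Qdrant.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorDB:
    """Hypothetical stand-in for a vector DB; keeps (vector, text) pairs in a list."""

    def __init__(self):
        self.rows = []

    def upsert(self, text):
        self.rows.append((embed(text), text))

    def search(self, query, top_k=1):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


def streaming_pipeline(news_stream, db):
    """Clean each incoming document, embed it, and load it into the vector DB."""
    for doc in news_stream:
        cleaned = " ".join(doc.split())  # collapse whitespace as a toy cleaning step
        if cleaned:  # skip empty documents
            db.upsert(cleaned)


def build_rag_prompt(question, db, top_k=1):
    """Retrieve the most similar news and prepend it to the question as context."""
    context = "\n".join(db.search(question, top_k=top_k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In a real setup, the prompt returned by `build_rag_prompt` would be passed to the fine-tuned LLM loaded from the model registry; that step is left out here because it depends on the serving stack.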
We will also show you how to integrate various serverless tools, such as:
• Comet ML as your ML Platform;
• Qdrant as your vector DB;
• Beam as your infrastructure.
Who is this for?
The series targets MLE, DE, DS, or SWE who want to learn to engineer LLM systems using LLMOps good principles.
How will you learn?
The series contains 4 hands-on video lessons and the open-source code you can access on GitHub.
Curious?
↳ Check it out and support us with a ⭐
#1. 8 types of MLOps tools that must be in your toolbelt to be a successful MLOps engineer
These are the 8 types of MLOps tools that must be in your toolbelt to be a successful MLOps engineer:
If you are into MLOps, you are aware of the 1000+ tools in the space, and you may think you have to know them all.
The reality is that all of these tools can be boiled down to 8 main categories.
If you learn the fundamentals and master one tool from each category, you will be fine.
.
1. Version control: crucial for the traceability and reproducibility of an ML model deployment or run. Without a version control system, it is difficult to find out which exact code version was responsible for specific runs or errors you might have in production. (🔧 GitHub, GitLab, etc.)
2. CI/CD: automated tests are triggered upon pull request creation, and deployment to production should only occur through the CD pipeline. (🔧 GitHub Actions, GitLab CI/CD, Jenkins, etc.)
3. Workflow orchestration: manage complex dependencies between different tasks, such as data preprocessing, feature engineering, and ML model training. (🔧 Airflow, ZenML, AWS Step Functions, etc.)
4. Model registry: store, version, and share trained ML model artifacts, together with additional metadata. (🔧 Comet ML, W&B, MLflow, etc.)
5. Docker registry: store, version, and share Docker images. Basically, all your code will be wrapped up in Docker images and shared through this registry. (🔧 Docker Hub, ECR, etc.)
6 & 7. Model training & serving infrastructure: if on-premise, you will likely have to go with Kubernetes. There are multiple choices if you are on a cloud provider: Azure ML on Azure, SageMaker on AWS, and Vertex AI on GCP.
8. Monitoring: monitoring in ML systems goes beyond what is needed for monitoring regular software applications. The distinction is that the model predictions can fail even if all the typical health metrics appear to be in good condition. (🔧 SageMaker, NannyML, Arize, etc.)
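To illustrate the monitoring point, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), in plain Python. It compares the distribution of a feature (or of the model's predictions) in production against a reference sample, which is the kind of signal that CPU, memory, and latency dashboards will not show. The function name, the equal-width binning, and the default parameters are my assumptions for this example, not the API of any tool listed above.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two non-empty samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # clamp values outside the reference range into the edge bins
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # floor each share at eps so the logarithm below is always defined
        return [max(c / len(values), eps) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on your data, so treat it as a starting point rather than a fixed rule.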
The secret sauce in MLOps is knowing how to glue all these pieces together while keeping things simple.
↳ To read more about these components, check out the full article.
#2. How to successfully present MLOps ideas to upper management
Have you ever presented your MLOps ideas to upper management just to get ghosted?
In that case...
Here are the 6 steps you have to know:
1. Collect all the pain points
Talk to data scientists, product owners, and stakeholders in your organization to gather issues such as:
- time to deployment
- poor quality deployment
- non-existing monitoring
- lack of collaboration
- external parties
2. Educate people
Organize workshops, meetings, etc., to present what MLOps is and how it can help.
I think it's critical to tailor the presentation to your target audience. For example, an engineer looks at the problem differently than the business stakeholders.
3. Present before and after scenarios
Show how MLOps can solve the company's challenges and deliver tangible benefits to the organization, such as:
- lower costs
- faster deployment
- better collaboration
- lower risk
4. Prove it
Use concrete examples to support your ideas, such as:
- how a competitor or an organization in the same or related field benefited from introducing MLOps
- build a PoC within your organization
5. Set up your team
Choose 2-3 experienced individuals (not juniors) to set up the foundations in your team/organization. Start with experienced engineers and only later bring more juniors to the party.
6. Keep on keepin' on
Once you successfully apply MLOps to one use case, you can bring in more responsibility by growing your team and taking on more projects.
.
All of these are great tips for integrating MLOps in your organization.
I love their "Present before and after scenarios" approach.
You can extrapolate this strategy to any other new process (not only MLOps).
.
↳ To learn the details, check out the full article.
#3. How I generated PyDocs for 100 Python functions in <1 hour
The most boring part of programming is writing PyDocs, so I usually write clean code and let it speak for itself.
But, for open-source projects where you have to generate robust documentation, PyDocs are a must.
The good news is that now you can automate this process using Copilot.
You can see in the video below an example of how easy it is.
I tested it on more complex functions/classes, and it works well. I chose this example because it fits nicely on one screen.
Now that I have tried Copilot for this, I will never go back.
It is true that, in some cases, you have to make some minor adjustments. But that is still 10000% more efficient than writing it from scratch.
If you want more examples, check out our Hands-on LLMs course, where 99% of the PyDocs were generated using Copilot in <1 hour.
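If you want to try this on your own codebase, a good first step is to find out which functions and classes still lack a PyDoc. Here is a minimal sketch using only Python's standard `ast` module. It is not part of Copilot or of the course: `find_undocumented` and `SAMPLE` are hypothetical names I use for illustration.

```python
import ast


def find_undocumented(source):
    """Return the names of functions and classes in `source` that have no docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


# A tiny module as a string: one documented and one undocumented function.
SAMPLE = '''
def documented(x):
    """Return x unchanged."""
    return x


def undocumented(x):
    return x * 2
'''
```

Calling `find_undocumented(SAMPLE)` lists only `undocumented`, so you know exactly where to point Copilot next. For a real project, you would read each `.py` file and pass its contents to the same function.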
That's it for today 👾
See you next Thursday at 9:00 a.m. CET.
Have a fantastic weekend!
Paul
Whenever you're ready, here is how I can help you:
The Full Stack 7-Steps MLOps Framework: a 7-lesson FREE course that will walk you step-by-step through how to design, implement, train, deploy, and monitor an ML batch system using MLOps good practices. It contains the source code + 2.5 hours of reading & video materials on Medium.
Machine Learning & MLOps Blog: in-depth topics about designing and productionizing ML systems using MLOps.
Machine Learning & MLOps Hub: a place where all my work is aggregated in one place (courses, articles, webinars, podcasts, etc.).