Fix your messy ML configs in your Python projects
2024 MLOps learning roadmap. Python syntax sugar that will help you write cleaner code.
Decoding ML Notes
This week our main focus will be a classic.
We will discuss Python.
More concretely, how to write cleaner code and applications in Python.
Is that even possible?
This week's topics:
My favorite way to implement a configuration layer in Python
Some Python syntax sugar that will help you write cleaner code
2024 MLOps learning roadmap
Since creating content, I learned one crucial thing: "Everybody likes to read and learn differently."
Do you prefer to read content on Medium?
Then, you are in luck.
Decoding ML is also on Medium.
Substack vs. Medium?
On Medium, we plan to post more extended and detailed content, while on Substack, we will write on the same topics but in a shorter and more concentrated manner.
If you want more code and less talking...
Check out our Medium publication:
Decoding ML Medium publication
My favorite way to implement a configuration layer in Python
This is my favorite way to implement a configuration/settings system in Python for all my apps.
The core is based on pydantic, a data validation library for Python.
More precisely, on their `BaseSettings` class.
Why use the pydantic BaseSettings class?
- you can quickly load values from .env files (or even JSON or YAML)
- add default values for the configuration of your application
- the MOST IMPORTANT one: it validates the type of the loaded variables. Thus, you can always be sure you are using correctly typed values to configure your system.
How do you implement it?
It is pretty straightforward.
You subclass the `BaseSettings` class and define all your settings at the class level.
It is similar to a Python dataclass but with an extra layer of data validation and factory methods.
If you assign a default value to a variable, it becomes optional.
If you leave it without a default, you must provide it in your .env file (or as an environment variable).
How do you integrate it with your ML code?
You often have a training (or inference) configuration stored in a JSON or YAML file (I prefer YAML files, as they are easier to read).
You shouldn't pollute your pydantic settings class with all the hyperparameters related to the module (as there are a lot, A LOT).
Also, to isolate the application and ML settings, the easiest way is to add the `training_config_path` to your settings and use a `TrainingConfig` class to load it independently.
Doing so lets you leverage your favorite way (probably the one you already have in your ML code) of loading a config file for the ML configuration: plain YAML or JSON files, Hydra, or other fancier methods.
Another plus is that you don't have to hardcode the path anywhere in your code. Hardcoded paths become a nightmare when you start using git with multiple people.
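One possible shape for such a `TrainingConfig` class is sketched below. It is only an illustration: the hyperparameter names and the `from_json` helper are hypothetical, and JSON is used to stay within the standard library (for YAML, PyYAML's `yaml.safe_load` would play the same role). A temporary file stands in for the path you would normally read from `training_config_path` in your settings.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainingConfig:
    # Hypothetical hyperparameters; a real config would hold many more.
    learning_rate: float
    epochs: int
    model_name: str

    @classmethod
    def from_json(cls, path):
        """Load the ML configuration from a JSON file."""
        with Path(path).open() as file:
            raw = json.load(file)
        return cls(**raw)


# Demo: write a small config file, then load it through its path.
with tempfile.TemporaryDirectory() as tmp_dir:
    training_config_path = Path(tmp_dir) / "training.json"
    training_config_path.write_text(
        json.dumps({"learning_rate": 3e-4, "epochs": 10, "model_name": "demo"})
    )
    training_config = TrainingConfig.from_json(training_config_path)

print(training_config)
```

The application settings only know where the file is, while the ML code owns what is inside it.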
What do you say? Would you start using the pydantic `BaseSettings` class in your ML applications?
Some Python syntax sugar that will help you write cleaner code
Here is some Python syntax sugar that will help you write cleaner code.
I am talking about the walrus operator denoted by the `:=` symbol.
It was introduced in Python 3.8, but I rarely see it used.
Thus, as a "clean code" freak, I wanted to dedicate a post to it.
What does the walrus operator do?
It's an assignment expression that allows you to assign and return a value in the same expression.
Why should you use it?
Conciseness: It reduces the number of lines needed for variable assignment and checking, making code more concise.
Readability: It can enhance readability by keeping related logic close, although this depends on the context and the reader's familiarity with exotic Python syntax.
Here are some examples:
1. Using the walrus operator, you can directly assign the result of the `len()` function inside an if statement.
2. You can avoid calling the same function twice in a while loop, which means less code and better readability.
3. Another use case arises in list comprehensions where a value computed in a filtering condition is also needed in the expression body. Before the walrus operator, if you had to apply a function to an item from a list and filter it based on some criteria, you had to refactor it to a standard for loop.
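The three cases above could look roughly like this. The data, the in-memory stream, and the `normalize` helper are made up for illustration.

```python
import io

# 1. Assign the result of len() directly inside an if statement.
items = [1, 2, 3, 4, 5, 6]
message = "List size is fine"
if (n := len(items)) > 5:
    message = f"List is too long ({n} elements, expected <= 5)"

# 2. Read and test in one place instead of calling read() before the
#    loop and again at the end of every iteration.
stream = io.StringIO("abcdefghij")
chunks = []
while chunk := stream.read(4):
    chunks.append(chunk)


# 3. Reuse a value computed in the filter of a list comprehension.
def normalize(text):
    """Hypothetical helper: trim whitespace and lowercase."""
    return text.strip().lower()


raw_names = ["  Adam ", "   ", "SGD", ""]
cleaned = [name for item in raw_names if (name := normalize(item))]

print(message)
print(chunks)
print(cleaned)
```

Note the parentheses around the assignment in cases 1 and 3: without them, the comparison in case 1 would bind before `:=`, and `n` would hold a boolean instead of the length.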
When writing clean code, the details matter.
The details make the difference between a codebase that can be read like a book and one with 10 WTFs per second.
What do you think? Does the walrus operator make the Python code more readable and concise?
2024 MLOps learning roadmap
Want to learn MLOps but got stuck at the 100th tool you think you must know? Here is the MLOps roadmap for 2024.
MLOps vs. ML engineer
In theory, MLEs focus on deploying models to production while MLOps engineers build the platform used by MLEs.
I think this depends heavily on the scale of the company. The smaller the company, the more these two roles overlap.
This roadmap will teach you how to build such a platform, from programming skills to MLOps components and infrastructure as code.
Here is the MLOps roadmap for 2024 suggested by
1. Programming
- Python & IDEs
- Bash basics & command line editors
2. Containerization and Kubernetes
- Docker
- Kubernetes
3. Machine learning fundamentals
...until now we laid down the fundamentals. Now let's get into MLOps.
4. MLOps principles
- reproducible,
- testable, and
- evolvable ML-powered software
5. MLOps components
- Version control & CI/CD pipelines
- Orchestration
- Experiment tracking and model registries
- Data lineage and feature stores
- Model training & serving
- Monitoring & observability
6. Infrastructure as code
- Terraform
As a self-learner, I wish I had access to this step-by-step plan when I started learning MLOps.
Remember, you should pick up this roadmap and tailor it to your current level.
Find more details about the roadmap in
MLOps roadmap 2024