The new king of Infrastructure as Code (IaC)
Monitoring your DL models while in production. How to build a scalable data collection pipeline
Decoding ML Notes
This week's topics:
The new king of Infrastructure as Code (IaC)
How to build a scalable data collection pipeline
Monitoring your DL models while in production
The new king of Infrastructure as Code (IaC)
This is the new king of Infrastructure as Code (IaC). Here is why it is better than Terraform or CDK ↓
→ I am talking about Pulumi ←
Let's see what it is made of.
---
What is Pulumi, and how is it different?
Unlike other IaC tools that use YAML, JSON, or a Domain-Specific Language (DSL), Pulumi lets you write code in general-purpose languages such as Python, TypeScript/JavaScript (Node.js), Go, and C#.
- This enables you to leverage existing programming knowledge and tooling for IaC tasks.
- Pulumi integrates with familiar testing libraries for unit and integration testing of your infrastructure code.
- It integrates with most cloud providers (AWS, GCP, Azure, Oracle, etc.)
Benefits of using Pulumi:
- Flexibility: use your preferred programming language for IaC + it works with most clouds out there.
- Efficiency: leverage existing programming skills and tooling.
- Testability: write unit and integration tests for your infrastructure code.
- Collaboration: enables Dev and Ops to work together using the same language.
If you disagree, try applying OOP or control-flow logic (if statements, for loops) to Terraform's HCL syntax.
It works, but it quickly becomes a living hell.
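As a contrast, here is a minimal, hypothetical Pulumi program in Python (the bucket names and tags are made up for illustration; it assumes the `pulumi` and `pulumi_aws` packages are installed and AWS credentials are configured):

```python
# A plain Python loop and `if` condition driving infrastructure creation:
# the kind of control flow that is painful to express in Terraform HCL.
import pulumi
import pulumi_aws as aws

environments = ["dev", "staging", "prod"]

for env in environments:
    bucket = aws.s3.Bucket(
        f"data-{env}",
        # Ordinary conditional: only prod gets versioning enabled.
        versioning=aws.s3.BucketVersioningArgs(enabled=(env == "prod")),
        tags={"environment": env},
    )
    pulumi.export(f"bucket_{env}", bucket.id)
```

Because this is just Python, you can extract the loop body into a function, unit-test it, or share it as a package, exactly like any other application code.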
How Pulumi works:
- Pulumi uses a declarative approach. You define the desired state of your infrastructure.
- It manages the state of your infrastructure using a state file.
- When changes are made to the code, Pulumi compares the desired state with the current state and creates a plan to achieve the desired state.
- The plan shows what resources will be created, updated, or deleted.
- You can review and confirm the plan before Pulumi executes it.
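This plan/apply loop maps onto a few CLI commands; a sketch of the typical workflow (it assumes the Pulumi CLI is installed and a stack is already configured):

```shell
pulumi preview        # dry run: show which resources would be created, updated, or deleted
pulumi up             # show the plan again, ask for confirmation, then execute it
pulumi stack export   # dump the state Pulumi uses to track your current infrastructure
```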
→ It works similarly to Terraform, but with all the benefits that your favorite programming language and its existing tooling provide.
→ It works similarly to CDK, but it is faster and supports your favorite cloud provider (not only AWS).
What do you think? Have you used Pulumi?
We started using it for the LLM Twin course, and so far, we love it! I will probably migrate entirely from Terraform to Pulumi in future projects.
🔗 More on Pulumi
How to build a scalable data collection pipeline
Build, deploy to AWS, and wire up IaC and CI/CD for a data collection pipeline that crawls your digital data. What do you need? ↓
The end goal?
A scalable data pipeline that crawls, collects, and stores all your digital data from:
- LinkedIn
- Medium
- Substack
- GitHub
To build it, here is what you need ↓
1. Selenium: a Python tool for automating web browsers. It's used here to interact with web pages programmatically (like logging into LinkedIn, navigating through profiles, etc.).
2. BeautifulSoup: a Python library for parsing HTML and XML documents. It creates parse trees that help us extract the data quickly.
3. MongoDB (or any other NoSQL DB): a NoSQL database that fits our unstructured text data like a glove.
4. An ODM (Object-Document Mapper): a technique that maps between an object model in an application and a document database.
5. Docker & AWS ECR: to deploy our code, we have to containerize it, build an image for every change of the main branch, and push it to AWS ECR.
6. AWS Lambda: we will deploy our Docker image to AWS Lambda, a serverless computing service that runs code without provisioning or managing servers. It executes your code only when needed and scales automatically, from a few daily requests to thousands per second.
7. Pulumi: the IaC tool used to programmatically create the AWS infrastructure: the MongoDB instance, ECR, the Lambdas, and the VPC.
8. GitHub Actions: used to build our CI/CD pipeline; on any merged PR to the main branch, it builds & pushes a new Docker image and deploys it to the AWS Lambda service.
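To make point 4 concrete, here is a minimal, self-contained sketch of the ODM idea; the `ArticleDocument` class and its fields are hypothetical, not taken from the course code:

```python
# Minimal ODM sketch: map an application object to a MongoDB-style
# document (a plain dict with an "_id" key) and back.
from dataclasses import dataclass, field, asdict
from uuid import uuid4


@dataclass
class ArticleDocument:
    platform: str  # e.g. "medium", "substack"
    content: str
    id: str = field(default_factory=lambda: str(uuid4()))

    def to_document(self) -> dict:
        """Serialize to the dict a document database would store."""
        doc = asdict(self)
        doc["_id"] = doc.pop("id")  # MongoDB's primary-key convention
        return doc

    @classmethod
    def from_document(cls, doc: dict) -> "ArticleDocument":
        """Rebuild the application object from a stored document."""
        doc = dict(doc)
        doc["id"] = doc.pop("_id")
        return cls(**doc)
```

A real ODM (e.g. one built on top of `pymongo`) adds insert/find methods on top of exactly this kind of mapping, so the rest of the application never touches raw dicts.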
Curious how these tools work together?
Then...
---
Check out Lesson 2 of the FREE LLM Twin Course created by Decoding ML...
...where we walk you step-by-step through the architecture and code of the data pipeline:
🔗 The Importance of Data Pipelines in the Era of Generative AI
Monitoring your DL models while in production
Monitoring is THE key MLOps element in ensuring your models in production are fail-safe. Here is an article on ML monitoring using Triton, Prometheus, and Grafana ↓
In his article, the author opens with an example from one of his projects where a main processing task was supposed to take <5 hours, but in production, it jumped to >8 hours.
→ This (or something similar) will happen to all of us.
Even to the greatest.
It's impossible to always anticipate everything that will happen in production (sometimes it is even a waste of time to try).
That is why you always need eyes and ears on your production ML system.
Otherwise, imagine how much $$$ or how many users he would have lost if he hadn't detected the ~3-4 hour loss in performance as quickly as possible.
Afterward, he explained step-by-step how to use:
- cAdvisor to scrape RAM/CPU usage per container.
- Triton Inference Server to serve ML models and expose GPU-specific metrics.
- Prometheus to connect the metric producers to their consumers by scraping and storing the metrics.
- Grafana to visualize the metrics.
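As a sketch of how these pieces connect, a Prometheus scrape config along these lines ties them together (the hostnames are hypothetical; by default, Triton serves Prometheus metrics on port 8002 and cAdvisor on port 8080):

```yaml
scrape_configs:
  - job_name: "triton"
    static_configs:
      - targets: ["triton:8002"]    # GPU and inference metrics at /metrics
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]  # per-container RAM/CPU usage
```

Grafana then points at Prometheus as a data source to build the dashboards.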
Check it out on Decoding ML ↓
---
🔗 How to ensure your models are fail-safe in production?
Images
If not otherwise stated, all images are created by the author.