Learn an end-to-end framework for production-ready LLM systems by building your LLM twin
Why you should take our new production-ready LLMs course
Decoding ML Notes
Want to 𝗹𝗲𝗮𝗿𝗻 an 𝗲𝗻𝗱-𝘁𝗼-𝗲𝗻𝗱 𝗳𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 for 𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗿𝗲𝗮𝗱𝘆 𝗟𝗟𝗠 𝘀𝘆𝘀𝘁𝗲𝗺𝘀 by 𝗯𝘂𝗶𝗹𝗱𝗶𝗻𝗴 your 𝗟𝗟𝗠 𝘁𝘄𝗶𝗻?
Then you are in luck.
↓↓↓
The Decoding ML team and I will 𝗿𝗲𝗹𝗲𝗮𝘀𝗲 (in a few days) a 𝗙𝗥𝗘𝗘 𝗰𝗼𝘂𝗿𝘀𝗲 called the 𝗟𝗟𝗠 𝗧𝘄𝗶𝗻: 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗥𝗲𝗮𝗱𝘆 𝗔𝗜 𝗥𝗲𝗽𝗹𝗶𝗰𝗮.
𝗪𝗵𝗮𝘁 𝗶𝘀 𝗮𝗻 𝗟𝗟𝗠 𝗧𝘄𝗶𝗻? It is an AI character that learns to write like somebody by incorporating that person's style and personality into an LLM.
Within the course, you will learn how to:
- architect
- train
- deploy
...a 𝗽𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗿𝗲𝗮𝗱𝘆 𝗟𝗟𝗠 𝘁𝘄𝗶𝗻 of yourself powered by LLMs, vector DBs, and good LLMOps practices, such as:
- experiment trackers
- model registries
- prompt monitoring
- versioning
- deploying LLMs
...and more!
It is an 𝗲𝗻𝗱-𝘁𝗼-𝗲𝗻𝗱 𝗟𝗟𝗠 𝗰𝗼𝘂𝗿𝘀𝗲 where you will 𝗯𝘂𝗶𝗹𝗱 a 𝗿𝗲𝗮𝗹-𝘄𝗼𝗿𝗹𝗱 𝗟𝗟𝗠 𝘀𝘆𝘀𝘁𝗲𝗺:
→ from start to finish
→ from data collection to deployment
→ production-ready
→ from NO MLOps to experiment trackers, model registries, prompt monitoring, and versioning
Who is this for?
Audience: MLEs (machine learning engineers), DEs (data engineers), DSs (data scientists), or SWEs (software engineers) who want to learn to engineer production-ready LLM systems using good LLMOps principles.
Level: intermediate
Prerequisites: basic knowledge of Python, ML, and the cloud
How will you learn?
The course contains 11 hands-on written lessons and the open-source code, which you can access on GitHub (WIP).
You can read everything at your own pace.
Costs?
The articles and code are completely free. They will always remain free.
This time, the Medium articles won't be behind any paywall. I want to make them entirely available to everyone.
Meet your teachers!
The course is created under the Decoding ML umbrella by:
Paul Iusztin | Senior ML & MLOps Engineer
Alex Vesa | Senior AI Engineer
Alex Razvant | Senior ML & MLOps Engineer
What will you learn to build?
🐍 𝘛𝘩𝘦 𝘢𝘳𝘤𝘩𝘪𝘵𝘦𝘤𝘵𝘶𝘳𝘦 𝘰𝘧 𝘵𝘩𝘦 𝘓𝘓𝘔 𝘵𝘸𝘪𝘯 𝘪𝘴 𝘴𝘱𝘭𝘪𝘵 𝘪𝘯𝘵𝘰 4 𝘗𝘺𝘵𝘩𝘰𝘯 𝘮𝘪𝘤𝘳𝘰𝘴𝘦𝘳𝘷𝘪𝘤𝘦𝘴:
𝗧𝗵𝗲 𝗱𝗮𝘁𝗮 𝗰𝗼𝗹𝗹𝗲𝗰𝘁𝗶𝗼𝗻 𝗽𝗶𝗽𝗲𝗹𝗶𝗻𝗲
- Crawl your digital data from various social media platforms.
- Clean, normalize, and load the data into a NoSQL DB through a series of ETL pipelines.
- Send database changes to a queue using the CDC pattern (sketched below).
☁ Deployed on AWS.
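To make the CDC step concrete, here is a minimal sketch, assuming MongoDB as the NoSQL DB and RabbitMQ as the queue; the connection URIs and the "llm_twin"/"cdc_queue" names are placeholders, not the course's exact setup:

```python
# Hypothetical CDC sketch: tail a MongoDB change stream and publish
# each newly inserted document to a RabbitMQ queue.
import json

import pika
from pymongo import MongoClient

mongo = MongoClient("mongodb://localhost:27017")
posts = mongo["llm_twin"]["posts"]  # placeholder DB/collection names

channel = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost")
).channel()
channel.queue_declare(queue="cdc_queue")

# watch() tails the collection's change stream
# (requires MongoDB to run as a replica set).
for change in posts.watch():
    if change["operationType"] == "insert":
        channel.basic_publish(
            exchange="",
            routing_key="cdc_queue",
            body=json.dumps(change["fullDocument"], default=str),
        )
```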
𝗧𝗵𝗲 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗽𝗶𝗽𝗲𝗹𝗶𝗻𝗲
- Consume messages from the queue through a Bytewax streaming pipeline.
- Every message will be cleaned, chunked, embedded (using Superlinked), and loaded into a Qdrant vector DB in real time (see the sketch below).
☁ Deployed on AWS.
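Here is a minimal sketch of that streaming flow using Bytewax's operator-style API (~v0.19). The in-memory TestingSource stands in for the real queue consumer, sentence-transformers stands in for Superlinked, and every name is illustrative:

```python
# Sketch of the streaming feature pipeline: consume -> clean -> chunk
# -> embed -> load into Qdrant. All names/models are placeholders.
import bytewax.operators as op
from bytewax.connectors.stdio import StdOutSink
from bytewax.dataflow import Dataflow
from bytewax.testing import TestingSource
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
qdrant = QdrantClient(":memory:")  # swap for your Qdrant URL
qdrant.create_collection(
    collection_name="posts",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def chunk(text: str) -> list[str]:
    # Naive fixed-size chunking, just for illustration.
    return [text[i : i + 256] for i in range(0, len(text), 256)]

def embed_and_load(piece: str) -> str:
    qdrant.upsert(
        collection_name="posts",
        points=[PointStruct(
            id=abs(hash(piece)) % 2**31,
            vector=embedder.encode(piece).tolist(),
            payload={"text": piece},
        )],
    )
    return piece

flow = Dataflow("feature_pipeline")
messages = op.input("queue", flow, TestingSource(["A raw post ..."]))
cleaned = op.map("clean", messages, str.strip)
chunks = op.flat_map("chunk", cleaned, chunk)
loaded = op.map("embed_load", chunks, embed_and_load)
op.output("out", loaded, StdOutSink())
# Run with: python -m bytewax.run this_module:flow
```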
𝗧𝗵𝗲 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝗽𝗶𝗽𝗲𝗹𝗶𝗻𝗲
- Create a custom dataset based on your digital data.
- Fine-tune an LLM using QLoRA (sketched below).
- Use Comet ML's experiment tracker to monitor the experiments.
- Evaluate and save the best model to Comet's model registry.
☁ Deployed on Qwak.
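For a taste of the fine-tuning step, here is a hedged QLoRA sketch built on transformers + peft + bitsandbytes, with the run logged to Comet ML. The base model and every hyperparameter are placeholders, not the course's exact choices:

```python
# QLoRA sketch: 4-bit quantized base model + trainable LoRA adapters,
# with parameters logged to Comet ML's experiment tracker.
import comet_ml  # import before transformers to enable auto-logging
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

experiment = comet_ml.Experiment(project_name="llm-twin")  # placeholder name

bnb = BitsAndBytesConfig(              # the "Q" in QLoRA: 4-bit base weights
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",       # illustrative base model
    quantization_config=bnb,
    device_map="auto",
)

lora = LoraConfig(                     # the "LoRA": small trainable adapters
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()     # only the adapter weights train

experiment.log_parameters({"r": 16, "lora_alpha": 32, "quant": "nf4"})
# ...run your trainer here, then evaluate and register the best model.
```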
𝗧𝗵𝗲 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗽𝗶𝗽𝗲𝗹𝗶𝗻𝗲
- Load and quantize the fine-tuned LLM from Comet's model registry.
- Deploy it as a REST API.
- Enhance the prompts using RAG (see the sketch below).
- Generate content using your LLM twin.
- Monitor the LLM using Comet's prompt monitoring dashboard.
☁ Deployed on Qwak.
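And a minimal sketch of the inference side: retrieve context from Qdrant, augment the prompt (the RAG step), generate, and serve it as a REST endpoint with FastAPI. The small models used here are stand-ins for the fine-tuned twin pulled from the model registry:

```python
# RAG inference sketch behind a REST API. gpt2 stands in for the
# fine-tuned LLM twin; collection/model names are placeholders.
from fastapi import FastAPI
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer
from transformers import pipeline

embedder = SentenceTransformer("all-MiniLM-L6-v2")
qdrant = QdrantClient("http://localhost:6333")
generator = pipeline("text-generation", model="gpt2")

app = FastAPI()

@app.get("/generate")
def generate(query: str) -> dict:
    # 1) Retrieve the most relevant chunks of your own writing.
    hits = qdrant.search(
        collection_name="posts",
        query_vector=embedder.encode(query).tolist(),
        limit=3,
    )
    context = "\n".join(hit.payload["text"] for hit in hits)
    # 2) Augment the prompt with the retrieved context (the RAG step).
    prompt = f"Context:\n{context}\n\nWrite in my style: {query}\n"
    # 3) Generate with the (stand-in) LLM twin.
    text = generator(prompt, max_new_tokens=128)[0]["generated_text"]
    return {"text": text}
# Run with: uvicorn this_module:app --reload
```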
𝘈𝘭𝘰𝘯𝘨𝘴𝘪𝘥𝘦 𝘵𝘩𝘦 4 𝘮𝘪𝘤𝘳𝘰𝘴𝘦𝘳𝘷𝘪𝘤𝘦𝘴, 𝘺𝘰𝘶 𝘸𝘪𝘭𝘭 𝘭𝘦𝘢𝘳𝘯 𝘵𝘰 𝘪𝘯𝘵𝘦𝘨𝘳𝘢𝘵𝘦 3 𝘴𝘦𝘳𝘷𝘦𝘳𝘭𝘦𝘴𝘴 𝘵𝘰𝘰𝘭𝘴:
- Comet ML as your ML Platform
- Qdrant as your vector DB
- Qwak as your ML infrastructure
Soon, we will release the first lesson from the 𝗟𝗟𝗠 𝗧𝘄𝗶𝗻: 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻-𝗥𝗲𝗮𝗱𝘆 𝗔𝗜 𝗥𝗲𝗽𝗹𝗶𝗰𝗮 course.
To stay updated...
𝘾𝙝𝙚𝙘𝙠 𝙞𝙩 𝙤𝙪𝙩 𝙤𝙣 𝙂𝙞𝙩𝙃𝙪𝙗 𝙖𝙣𝙙 𝙨𝙪𝙥𝙥𝙤𝙧𝙩 𝙪𝙨 𝙬𝙞𝙩𝙝 𝙖 ⭐️
↓↓↓
🔗 LLM Twin: Building Your Production-Ready AI Replica Course GitHub Repository