Fine-tuning large language models (LLMs) often requires substantial resources, yet recent advancements allow for effective training on a single GPU.
This practical, 3.5-hour course introduces industry professionals from start-ups, SMEs, and large enterprises to efficient techniques for fine-tuning LLMs without sacrificing model performance.
In this course, participants will learn about:
- Quantization: Techniques to reduce model memory requirements, enabling the deployment of LLMs on resource-constrained hardware.
- Parameter-Efficient Fine-Tuning (PEFT) with LoRA: Fine-tune large models by training a small set of added low-rank parameters while the original weights stay frozen, reducing computational load while achieving high-quality results.
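To give a feel for the two ideas above, here is a minimal, library-free sketch. It is not the course material: the function names are hypothetical, the quantizer assumes a simple absmax scheme with one shared scale (one of several possible schemes), and the 4096 x 4096 layer size and rank 8 are illustrative values only.

```python
def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (absmax scheme)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in q_weights]


def lora_trainable_params(d_in, d_out, rank):
    """LoRA trains two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


# Quantization: each weight is stored as a small integer plus one shared scale.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_absmax_int8(weights)
restored = dequantize(q, scale)
print("quantized:", q)
print("restored: ", restored)

# LoRA: compare trainable parameters for one hypothetical 4096 x 4096 layer.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=8)
print(f"LoRA trains {lora} of {full} weights ({lora / full:.2%})")
```

The restored values differ from the originals by at most half a quantization step, which is the precision traded for the smaller storage format. In the LoRA half, only the two small factors are counted as trainable; the full weight matrix stays frozen.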
Through hands-on exercises and guided demos, participants will gain practical experience in fine-tuning LLMs using these methods, setting the stage for immediate application in real-world projects. Examples range from question-answering chatbots in an online shop to digital lawyers interpreting national law in the context of a specific case.
By the end of this course, participants will have developed a foundational understanding of techniques that reduce the memory footprint of a large language model during fine-tuning and inference.
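As a rough illustration of what "memory footprint" means here, the sketch below estimates the memory needed just to hold a model's weights at different precisions. The 7-billion-parameter size is a hypothetical example, and the helper name is made up for this sketch. The estimate deliberately ignores activations, gradients, optimizer states, and the small overhead of quantization constants, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
num_params = 7_000_000_000

for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{name:>5}: {weight_memory_gb(num_params, bits):.1f} GB")
```

Under these assumptions, halving the number of bits per weight halves the weight memory, which is why lower-precision formats can bring a model within reach of a single GPU.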