
Join us for a deep technical dive at ML OpenTalk November 2025: Speeding up training with FP8 and Triton
Training Large Language Models requires a lot of compute. To reduce GPU costs and speed up both compute and communication, research labs have started exploring training in lower precision. FP8 has the potential to speed up a kernel by up to 2x, but may introduce non-trivial quality degradation if applied carelessly.
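For intuition on where that degradation comes from, here is a minimal sketch of per-tensor scaled FP8 quantization (assuming a recent PyTorch build that ships the torch.float8_e4m3fn dtype; the scaling recipe is illustrative, not the one from the talk):

```python
import torch

# Per-tensor scaling: map the tensor's max absolute value onto
# FP8 E4M3's largest representable value (~448), cast down, then
# dequantize to measure the rounding error the cast introduced.
x = torch.randn(1024, 1024)
fp8_max = torch.finfo(torch.float8_e4m3fn).max
scale = fp8_max / x.abs().max()
x_fp8 = (x * scale).to(torch.float8_e4m3fn)
x_restored = x_fp8.to(torch.float32) / scale
print("max abs rounding error:", (x - x_restored).abs().max().item())
```

A poorly chosen scale, or one stale scale shared across very differently distributed tensors, is exactly the kind of mistake that turns this rounding error into visible loss degradation.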
In this talk, Vlad from the YandexGPT pretraining team will give an overview of recent papers on FP8 training and the related open-source software, and share insights from production-scale FP8 pretraining. Along the way, we will cover how GPUs work and implement Triton kernels to speed up computations.
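For a flavor of what such a kernel looks like, here is a minimal Triton sketch (our own illustration, not code from the talk; it assumes an NVIDIA GPU and Triton/PyTorch versions with FP8 support) that fuses scaling and the FP8 downcast into a single pass over memory:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def scale_to_fp8_kernel(x_ptr, out_ptr, scale, n_elements, BLOCK: tl.constexpr):
    # Each program instance processes one BLOCK-sized chunk of the tensor.
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    x = tl.load(x_ptr + offs, mask=mask)
    # Scale and downcast to FP8 E4M3 (tl.float8e4nv on NVIDIA hardware).
    y = (x * scale).to(tl.float8e4nv)
    tl.store(out_ptr + offs, y, mask=mask)

x = torch.randn(1 << 20, device="cuda")
out = torch.empty_like(x, dtype=torch.float8_e4m3fn)
scale = float(torch.finfo(torch.float8_e4m3fn).max) / x.abs().max().item()
grid = (triton.cdiv(x.numel(), 1024),)
scale_to_fp8_kernel[grid](x, out, scale, x.numel(), BLOCK=1024)
```

Fusing the scale and the cast means the tensor is read and written only once, which is the kind of memory-traffic saving such kernels target.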
Date: November 13, 7:00 PM
Venue: Yandex Hall, Yerevan
Language: English
Register for ML OpenTalk November 2025: https://forms.yandex.ru/surveys/13794306.bdd0707f942d9c6592938b40d89b1b6a5d672933
You can find more tech events happening in Armenia here.