Are you ready for the PyData Yerevan November 2023 meetup?
If you have ever wondered how to build your own speech dataset from Internet data, we have great news for you!
Join us as Nikolay Karpov, a Senior Research Scientist in speech recognition and NLP at NVIDIA NeMo, presents “How to prepare a speech dataset and minimize the amount of boilerplate code required?”
Processing large amounts of data for training neural models often requires more effort than designing and training the network itself. The NVIDIA NeMo team has built the Speech Data Processor tool to simplify the process: https://github.com/NVIDIA/NeMo-speech-data-processor
During the talk, you will explore the steps for speech dataset preparation, including:
Video-to-audio conversion
Metadata parsing
Audio and text language identification
Speech recognition
Text normalization
Filtering by metrics and regular expressions
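To give a flavor of the last step, here is a minimal sketch of filtering a JSON-lines manifest by a metric and a regular expression. It is not the Speech Data Processor API: the field names (`audio_filepath`, `text`, `pred_text`), the threshold `MAX_WER`, and the pattern `ALLOWED_TEXT` are assumptions chosen for illustration only.

```python
import json
import re

# Hypothetical threshold and pattern, chosen only for illustration.
MAX_WER = 0.3
ALLOWED_TEXT = re.compile(r"^[a-z' ]+$")


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def keep_entry(entry: dict) -> bool:
    """Keep an entry if its text matches the pattern and the WER is low enough."""
    text = entry["text"]
    if not ALLOWED_TEXT.match(text):
        return False
    return word_error_rate(text, entry["pred_text"]) <= MAX_WER


def filter_manifest(lines):
    """Parse JSON lines (skipping blank ones) and return the entries that pass."""
    entries = (json.loads(line) for line in lines if line.strip())
    return [entry for entry in entries if keep_entry(entry)]
```

Here `text` stands for the reference transcript and `pred_text` for the output of a speech recognition model, so entries where the two disagree too much, or where the transcript contains unexpected characters, are dropped.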
Register soon and join us for the talk on November 16 at 19:00 at the PMI Science R&D Center in Armenia (Teryan 105, building 13): https://forms.gle/3S9pg4imAyCsM8u66
Join PyData Yerevan’s Telegram channel for more updates: https://t.me/pydatayerevan
Big thanks to PyData Yerevan meetups’ sponsor, PMI Science R&D Center in Armenia, and our supporting organizers, Datamotus and Akian College of Science and Engineering.