Employment Information
Join NVIDIA and contribute to AI and deep learning work that shapes technology across many fields. As a member of the Speech Data Processing team, you will push the boundaries of speech and language AI. This role offers an outstanding chance to work with NVIDIA's NeMo framework and develop groundbreaking tools.
What you'll be doing:
- Improve and refine a sophisticated speech data normalization and alignment tool using NVIDIA's NeMo framework.
- Apply text normalization and audio-text alignment techniques to prepare large-scale datasets for advanced speech processing tasks.
- Ensure robust handling of input audio and textual data to deliver highly accurate, automated spoken text outputs.
- Collaborate with multidisciplinary teams to translate requirements into practical tool design and implementation.
- Conduct comprehensive testing and validation to meet exacting internal quality standards.
- Help deploy recent advances that improve NVIDIA's speech and language AI technologies.
What we need to see:
- Current enrollment in a Bachelor's, Master's, or PhD program in Computer Science, Engineering, or related field.
- Strong programming experience in Python (or a comparable language).
- Familiarity with ML frameworks/libraries (e.g., PyTorch or TensorFlow).
- Foundational knowledge of NLP and speech recognition concepts.
- Collaborative, inclusive approach with strong problem-solving skills and curiosity.
Ways to stand out from the crowd:
- Hands-on experience with NVIDIA's NeMo toolkit or similar platforms for speech and language processing.
- Previous internships or substantial project experience in data engineering or machine learning for speech applications.
- Direct knowledge of techniques for audio signal processing and text normalization.

