Speech Data Scientist

This role is for one of Weekday's clients.

Salary range: Rs 24,00,000 - Rs 35,00,000 (i.e., INR 24-35 LPA)

Min Experience: 6 years

Location: Bangalore

Job Type: Full-time

We are seeking a skilled Speech Data Scientist to design, develop, and optimize advanced speech analytics and automatic speech recognition (ASR) solutions. The ideal candidate will work on end-to-end speech pipelines, multilingual audio processing, and model deployment in production environments. You will also drive research and innovation in speech processing, contributing to model enhancement and high-impact technical solutions.

Key Responsibilities

Core Development & Implementation

Design and implement end-to-end speech analytics pipelines for production.
Develop ASR engines using frameworks such as Wav2vec, Whisper, and Deep Speech with PyTorch or TensorFlow.
Build and optimize speaker diarization, language identification (LID), and text post-processing systems.
Focus on multilingual audio processing and domain adaptation strategies.
Lead data selection and preprocessing for improved model performance.

Model Development & Enhancement

Develop and analyze objective measures for speech quality evaluation and enhancement.
Implement speaker-conditioned personalization techniques to improve ASR accuracy in noisy environments.
Optimize on-device ASR models, emphasizing multi-language scenarios.
Guide teams on best practices for model accuracy and performance optimization.

Research & Innovation

Conduct research on advanced speech processing and neural speech enhancement techniques.
Develop novel solutions for multi-speaker and complex audio scenarios.
Contribute to patents, publications, and technical thought leadership in speech technology.
Stay updated on transformer models, attention mechanisms, and foundation models.

Technical Integration & Deployment

Design integration architectures for speech-to-text services and related technologies.
Implement MLOps processes and CI/CD pipelines for speech model deployment.
Deploy and scale speech solutions on cloud platforms (AWS, GCP).
Develop production-ready applications using Python, C++, and Java.

Required Qualifications

Education

Ph.D./M.S./M.Tech in Computer Science, Signal Processing, or a related field preferred.
B.Tech/B.E. in ECE, CSE, or a related technical field required.

Technical Expertise

Speech Processing: 3-6 years of hands-on experience in ASR and speech analytics. Strong knowledge of HMMs, GMMs, ANNs, language modeling, CNNs, RNNs, LSTMs, CTC, and attention mechanisms.
Machine Learning / Deep Learning: Proficiency in PyTorch and TensorFlow; experience with transformer models (BERT, Wav2vec 2.0, Whisper) and end-to-end ASR implementation.
Programming & Tools: Strong Python skills (NumPy, pandas, scikit-learn), experience with C++ or Java for production code, Bash scripting, and Git.
Cloud & Deployment: Hands-on experience with AWS/GCP, containerization (Docker, Kubernetes), MLOps, CI/CD pipelines, and scalable model serving.

Skills

ASR, Speech Recognition, Speech Analytics, Multilingual Audio Processing, Python, PyTorch, TensorFlow, Deep Learning
