AI Speech Engineer – Custom TTS (On-Device, Multilingual)
About the Role
We are looking for an experienced AI Speech Engineer with deep expertise in fine-tuning open-source TTS engines on custom voice datasets. The role requires the ability to build high-quality, multilingual, expressive TTS voices (with prosody cues, non-verbal expressions, and emotional variation) and to optimize them for fully offline use on mobile devices.
You will be responsible for building a world-class TTS pipeline end to end: dataset preparation → fine-tuning → evaluation → on-device deployment (Android/iOS).
Responsibilities
- Fine-tune open-source TTS models (e.g., VITS, Glow-TTS, Tacotron2, FastSpeech2, Coqui TTS, fairseq speech-synthesis models, Bark-like models); a representative fine-tuning sketch follows this list.
- Build custom voices from multi-actor recordings, including prosody cues and non-verbal expressions (NVEs) such as laughter, sighs, and whispers, plus emphasis and emotional tones.
- Format and preprocess multilingual datasets (e.g., English, Urdu, Arabic, Hindi).
- Implement voice cloning and speaker adaptation methods (speaker embeddings, x-vectors, HuBERT/ContentVec conditioning).
- Apply latest fine-tuning techniques (gradual unfreezing, adversarial training, vocoder adaptation, multi-speaker conditioning).
- Deploy optimized TTS engines on-device using ONNX Runtime, TensorFlow Lite, Core ML, or custom inference runtimes.
- Optimize for low-latency, low-memory, and battery-efficient speech generation on mobile CPUs/NPUs/GPUs.
- Evaluate output quality (MOS, prosody accuracy, multilingual pronunciation consistency).
- Collaborate with engineers to integrate TTS modules into apps for real-time, offline speech synthesis.
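To make the bar concrete: fine-tuning here means hands-on work at roughly the level of the sketch below, which resumes VITS training from a pretrained checkpoint with Coqui TTS. The dataset path, LJSpeech-style formatter, checkpoint filename, and hyperparameters are illustrative assumptions, not our actual setup.

```python
# Minimal VITS fine-tuning sketch with Coqui TTS; paths/settings are illustrative.
from trainer import Trainer, TrainerArgs
from TTS.tts.configs.shared_configs import BaseDatasetConfig
from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.datasets import load_tts_samples
from TTS.tts.models.vits import Vits
from TTS.tts.utils.text.tokenizer import TTSTokenizer
from TTS.utils.audio import AudioProcessor

# LJSpeech-style metadata.csv; a custom formatter would be used for real actor data.
dataset_config = BaseDatasetConfig(
    formatter="ljspeech", meta_file_train="metadata.csv", path="data/my_voice/"
)
config = VitsConfig(
    run_name="vits_finetune",
    batch_size=16,
    epochs=1000,
    text_cleaner="phoneme_cleaners",
    use_phonemes=True,
    phoneme_language="en-us",  # swapped per target language
    output_path="runs/",
    datasets=[dataset_config],
)
ap = AudioProcessor.init_from_config(config)
tokenizer, config = TTSTokenizer.init_from_config(config)
train_samples, eval_samples = load_tts_samples(dataset_config, eval_split=True)
model = Vits(config, ap, tokenizer, speaker_manager=None)

# restore_path warm-starts from a pretrained checkpoint rather than training from scratch.
trainer = Trainer(
    TrainerArgs(restore_path="pretrained/vits_checkpoint.pth"),
    config,
    config.output_path,
    model=model,
    train_samples=train_samples,
    eval_samples=eval_samples,
)
trainer.fit()
```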
Mandatory Skills Checklist (Applicants must demonstrate experience in ALL of the following)
✅ TTS Model Fine-Tuning
- Hands-on fine-tuning of open-source TTS engines (Coqui TTS, VITS, Glow-TTS, Tacotron, FastSpeech).
- Building multilingual and multi-speaker models.
- Dataset alignment: phoneme extraction, grapheme-to-phoneme conversion (G2P), and forced alignment with the Montreal Forced Aligner (MFA) or equivalent; a G2P sketch follows this list.
- Handling prosody, cues, and NVEs in dataset labeling.
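For multilingual G2P, we expect comfort with tooling along the lines of this sketch, which phonemizes short sentences via the phonemizer package (assumes the espeak-ng backend is installed; the sample sentences all mean "This is an example."):

```python
# G2P sketch: text -> IPA-style phoneme strings with the phonemizer package.
from phonemizer import phonemize

samples = {
    "en-us": "This is an example.",
    "ur": "یہ ایک مثال ہے۔",
    "ar": "هذا مثال.",
    "hi": "यह एक उदाहरण है।",
}
for lang, text in samples.items():
    phones = phonemize(text, language=lang, backend="espeak", strip=True)
    print(f"{lang}: {phones}")
```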
✅ Voice Dataset Engineering
- Preparing raw actor recordings → cleaned, labeled dataset.
- Handling multilingual phoneme sets (IPA, G2P for Urdu, Arabic, Hindi, English).
- Speaker embedding extraction (d-vectors, x-vectors, ECAPA, HuBERT units); an extraction sketch follows this list.
- Noise reduction, augmentation, silence trimming, forced alignment.
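As one example of the embedding work above, the sketch below pulls ECAPA-TDNN speaker embeddings from a pretrained SpeechBrain model; the audio path is a placeholder and the input is assumed to be 16 kHz mono:

```python
# Speaker-embedding sketch: 192-dim ECAPA-TDNN vectors via SpeechBrain.
import torchaudio
from speechbrain.pretrained import EncoderClassifier  # speechbrain.inference in newer releases

encoder = EncoderClassifier.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained/ecapa",
)
signal, sr = torchaudio.load("data/actor_01/utt_0001.wav")  # expects 16 kHz mono
embedding = encoder.encode_batch(signal)  # shape: (1, 1, 192)
print(embedding.squeeze())  # speaker vector for conditioning or clustering
```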
✅ On-Device Deployment
- Exporting TTS models to ONNX/TFLite/Core ML.
- Running inference with optimized vocoders (HiFi-GAN, WaveGlow, Parallel WaveGAN).
- Experience with quantization/pruning of speech models for mobile.
- Benchmarking real-time inference: latency, RAM usage, and energy efficiency; a quantization-and-latency sketch follows this list.
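To illustrate the deployment side, here is a sketch of dynamic INT8 quantization plus a crude CPU latency check with ONNX Runtime. It assumes a single phoneme-ID input for brevity (a real VITS export typically takes several inputs: token IDs, lengths, noise scales), and the filenames are placeholders:

```python
# Deployment sketch: dynamic INT8 quantization + latency benchmark (ONNX Runtime).
import time
import numpy as np
import onnxruntime as ort
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic("vits.onnx", "vits.int8.onnx", weight_type=QuantType.QInt8)

sess = ort.InferenceSession("vits.int8.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
dummy = np.random.randint(0, 100, size=(1, 64)).astype(np.int64)  # fake phoneme IDs

runs = 20
start = time.perf_counter()
for _ in range(runs):
    sess.run(None, {inp.name: dummy})
print(f"mean latency: {(time.perf_counter() - start) / runs * 1000:.1f} ms")
```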
✅ Latest Techniques Knowledge
- Expressive/controllable TTS (prosody embeddings, style tokens, GST, variational prosody models).
- Speaker adaptation & cross-lingual voice transfer.
- Handling low-resource languages (Urdu, Arabic).
- Evaluation frameworks (MOS testing, AB preference tests, WER for intelligibility); a scoring sketch follows this list.
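On evaluation, candidates should be comfortable with scoring at least at this level: aggregating listener MOS ratings with a confidence interval, and checking intelligibility by transcribing synthesized audio with an ASR model and computing WER against the input text (the ASR step is elided below; ratings and transcripts are made-up examples):

```python
# Evaluation sketch: MOS aggregation (numpy) and intelligibility WER (jiwer).
import numpy as np
import jiwer

mos_ratings = np.array([4, 5, 3, 4, 4, 5, 4])  # 1-5 listener scores for one system
mean = mos_ratings.mean()
ci95 = 1.96 * mos_ratings.std(ddof=1) / np.sqrt(len(mos_ratings))
print(f"MOS: {mean:.2f} ± {ci95:.2f}")

reference = "the quick brown fox jumps over the lazy dog"  # TTS input text
hypothesis = "the quick brown fox jumps over a lazy dog"   # ASR transcript of the audio
print(f"WER: {jiwer.wer(reference, hypothesis):.2%}")
```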
Nice to Have
- Contributions to open-source TTS projects (Coqui, ESPnet, Fairseq, Bark, etc.).
- Experience with speech-to-speech systems or multimodal pipelines.
- Familiarity with distillation/quantization of TTS models for edge devices.
- Experience with custom vocoder design for emotional or non-verbal cues.
Application Requirements
Applicants must include:
- A short case study of a TTS model they fine-tuned (dataset type, model used, output samples).
- A short case study of deploying a TTS model on-device (framework, device, latency, memory usage).
- Links to audio samples, demos, GitHub repos, or production apps showing custom voices.
Job Type: Full-time
Pay: Rs250,000.00 - Rs400,000.00 per month
Work Location: In person