Key Responsibilities
1. AI Model Development & Training
- Build, train, and fine-tune models across:
  - Multilingual NMT & Adaptive Translation models
  - Transliteration models for Indian languages
  - ASR systems (Whisper, NVIDIA NeMo, Indic-ASR)
  - TTS models
  - LLMs & embedding models for RAG
- Create and manage large multilingual datasets (20+ Indian languages).
- Perform dataset preprocessing, augmentation, and large-scale training.
- Benchmark models using BLEU, chrF++, WER, CER, and custom metrics.
- Convert and optimize models using CTranslate2, Faster-Whisper, ONNX, INT4/INT8 quantization, etc.
2. Model Optimization for Production
- Reduce model size using pruning and quantization.
- Optimize inference for real-time workloads.
- Improve GPU/CPU utilization and memory efficiency.
- Build scalable inference pipelines for translation, ASR, OCR, and RAG systems.
3. Audio & Video Processing Systems
- Develop advanced audio transcription & translation pipelines.
- Implement real-time Indic language STT systems.
- Create subtitle extraction and SRT translation workflows.
- Integrate diarization, language detection, summarization, and cross-lingual translation components.
4. RAG & LLM-Based Systems
- Architect multilingual RAG pipelines.
- Build vector databases and embedding systems.
- Implement document parsing, indexing, chunking, and hybrid retrieval.
- Integrate LLMs (Llama, Gemma, Qwen, etc.) for chatbot/voice-bot solutions.
5. Infrastructure & Server Management
- Manage AI/ML infrastructure on AWS & GCP (GPU provisioning, tuning).
- Optimize GPU usage, reduce infra cost, and manage server scheduling.
- Implement monitoring, auto-restart, logging, and fail-safe systems.
- Deploy highly available APIs for translation, ASR, OCR, and chatbot services.
- Troubleshoot cloud GPU environments (NVIDIA drivers, CUDA issues).
6. Cross-Functional Collaboration
- Work with Sales, Ops, and Tech teams to support enterprise and government clients.
- Handle escalations, server outages, and critical deployments.
- Maintain detailed documentation for APIs, model flows, and deployment processes.
- Build internal tools to streamline workflows and reduce team dependencies.
Required Skills & Experience
Technical Skills
- Strong foundation in NLP, Speech Processing, Deep Learning, and Generative AI.
- 4–5 years of hands-on experience with production-grade ML/NLP systems.
- Proficiency in:
  - Python, PyTorch, TensorFlow
  - ASR, TTS, LLMs, and Transformer-based architectures
  - CTranslate2, Faster-Whisper, ONNX Runtime
  - LLM inference frameworks: vLLM, SGLang
  - Quantization techniques (AWQ, INT4/INT8)
  - Vector DBs: FAISS, Pinecone
  - Docker, FastAPI, Linux
  - AWS/GCP GPU Infrastructure
- Expertise in Indian language NLP and multilingual model building.
- Experience creating datasets and training models from scratch.
Bonus Skills
- Knowledge of WebRTC or real-time streaming protocols.
- Experience with Streamlit/Gradio for AI demos.
- Familiarity with TTS, voice pipelines, barge-in systems, and telephony APIs.
- Experience with NVIDIA NeMo or similar speech frameworks.
Job Types: Full-time, Permanent
Pay: ₹1,200,000.00 - ₹1,500,000.00 per year
Experience:
- Total work: 4 years (Preferred)
- AI: 4 years (Preferred)
- NLP: 4 years (Preferred)
Work Location: In person