Qureos

AI Engineer - Senior

Key Responsibilities

1. AI Model Development & Training

  • Build, train, and fine-tune models across:
      • Multilingual NMT & Adaptive Translation models
      • Transliteration models for Indian languages
      • ASR systems (Whisper, NVIDIA NeMo, Indic-ASR)
      • TTS models
      • LLMs & embedding models for RAG
  • Create and manage large multilingual datasets (20+ Indian languages).
  • Perform dataset preprocessing, augmentation, and large-scale training.
  • Benchmark models using BLEU, chrF++, WER, CER, and custom metrics.
  • Convert and optimize models using CTranslate2, Faster-Whisper, ONNX, INT4/INT8 quantization, etc.

2. Model Optimization for Production

  • Reduce model size using pruning and quantization.
  • Optimize inference for real-time workloads.
  • Improve GPU/CPU utilization and memory efficiency.
  • Build scalable inference pipelines for translation, ASR, OCR, and RAG systems.

3. Audio & Video Processing Systems

  • Develop advanced audio transcription & translation pipelines.
  • Implement real-time Indic language STT systems.
  • Create subtitle extraction and SRT translation workflows.
  • Integrate diarization, language detection, summarization, and cross-lingual translation components.

4. RAG & LLM-Based Systems

  • Architect multilingual RAG pipelines.
  • Build vector databases and embedding systems.
  • Implement document parsing, indexing, chunking, and hybrid retrieval.
  • Integrate LLMs (Llama, Gemma, Qwen, etc.) for chatbot/voice-bot solutions.

5. Infrastructure & Server Management

  • Manage AI/ML infrastructure on AWS & GCP (GPU provisioning, tuning).
  • Optimize GPU usage, reduce infra cost, and manage server scheduling.
  • Implement monitoring, auto-restart, logging, and fail-safe systems.
  • Deploy highly available APIs for translation, ASR, OCR, and chatbot services.
  • Troubleshoot cloud GPU environments (NVIDIA drivers, CUDA issues).

6. Cross-Functional Collaboration

  • Work with Sales, Ops, and Tech teams to support enterprise and government clients.
  • Handle escalations, server outages, and critical deployments.
  • Maintain detailed documentation for APIs, model flows, and deployment processes.
  • Build internal tools to streamline workflows and reduce team dependencies.

Required Skills & Experience

Technical Skills

  • Strong foundation in NLP, Speech Processing, Deep Learning, and Generative AI.
  • 4–5 years of hands-on experience with production-grade ML/NLP systems.
  • Proficiency in:
      • Python, PyTorch, TensorFlow
      • ASR, TTS, LLMs, and Transformer-based architectures
      • CTranslate2, Faster-Whisper, ONNX Runtime
      • LLM inference frameworks: vLLM, SGLang
      • Quantization techniques (AWQ, INT4/INT8)
      • Vector DBs: FAISS, Pinecone
      • Docker, FastAPI, Linux
      • AWS/GCP GPU infrastructure
  • Expertise in Indian language NLP and multilingual model building.
  • Experience creating datasets and training models from scratch.

Bonus Skills

  • Knowledge of WebRTC or real-time streaming protocols.
  • Experience with Streamlit/Gradio for AI demos.
  • Familiarity with TTS, voice pipelines, barge-in systems, telephony APIs.
  • Experience with NVIDIA NeMo or similar speech frameworks.

Job Types: Full-time, Permanent

Pay: ₹1,200,000.00 - ₹1,500,000.00 per year

Benefits:

  • Paid time off

Application Question(s):

  • Notice Period

Experience:

  • Total work: 4 years (Preferred)
  • AI: 4 years (Preferred)
  • NLP: 4 years (Preferred)

Work Location: In person
