Qureos

Lead Engineer (AI/ML - Inferencing)

India

Overview:

Join the Prodapt team in building a unified, cloud-native environment for scalable machine learning inferencing. You will help design, develop, and optimize robust workflows that empower data scientists and engineers to efficiently deploy, serve, and monitor ML models at scale, supporting both real-time and batch inference use cases.

Responsibilities:
  • Develop, maintain, and enhance model deployment workflows using Seldon, Docker, and Kubernetes for scalable inferencing.
  • Build and optimize REST and gRPC endpoints for serving ML models, ensuring secure and reliable access from internal tools and pipelines.
  • Integrate with AI Hub for unified model registration, endpoint management, and monitoring.
  • Support both online (real-time) and offline (batch) inference, including large-scale batch scoring workloads.
  • Manage container images and model artifacts using Docker Hub, Artifact Registry, and Google Cloud Storage.
  • Implement and maintain CI/CD pipelines for automated model deployment and endpoint promotion.
  • Ensure robust security, compliance, and governance, including role-based access control and audit logging.
  • Collaborate with data scientists, ML engineers, and platform teams to deliver production-grade inferencing solutions.
  • Participate in code reviews, architecture discussions, and continuous improvement of the inferencing platform.

Requirements:

Required Technical Skills

  • Proficiency in Python for ML model development and deployment.
  • Experience with Seldon Core for model serving and endpoint management.
  • Hands-on experience with Docker for containerization and Kubernetes for orchestration.
  • Familiarity with REST and gRPC API development for model serving.
  • Experience with cloud platforms (GCP preferred), including Artifact Registry and Google Cloud Storage.
  • Strong understanding of CI/CD tools and automation for ML workflows.
  • Knowledge of model governance, security best practices, and compliance requirements.
  • Excellent troubleshooting, debugging, and communication skills.

Preferred Qualifications

  • Experience with batch inference frameworks and large-scale batch scoring.
  • Familiarity with large-scale financial/ML platforms.
  • Exposure to agentic frameworks, custom application deployment (BYOA), and orchestration SDKs.
  • Experience with monitoring, dashboarding, and operational health tools for inferencing endpoints.
  • Knowledge of data privacy, vulnerability scanning, and compliance automation in ML environments.

© 2025 Qureos. All rights reserved.