Qureos

LLM Ops Engineer – Oracle Cloud Infrastructure (OCI) – AI Consulting & Strategy

Experience : 5+ years
Education : Bachelor’s/Master’s degree
Work Location : Chennai, India (Chennai/Remote/Hybrid)

Key Responsibilities :

  • Deploy, scale, and maintain LLM models on OCI OKE clusters with GPU acceleration.
  • Automate model deployment pipelines integrated with OCI DevOps and Terraform.
  • Ensure high availability and disaster recovery using OCI region/AD-based deployment architectures.
  • Implement cost and resource optimization across OCI Compute and Storage resources.
  • Develop observability dashboards using OCI Monitoring and APM for LLM inference pipelines.
  • Collaborate with Data Scientists to operationalize and monitor model performance in OCI environments.
  • Maintain governance, compliance, and OCI security best practices for AI infrastructure.

Required Technical Skills :

OCI Infrastructure & Orchestration

  • Strong experience managing AI workloads on Oracle Kubernetes Engine (OKE).
  • Proficiency in OCI Compute, OCI Block Volumes, and OCI Object Storage for AI data pipelines.
  • Experience with OCI Networking, Load Balancers, and Service Gateway for secure AI deployments.
  • Expertise in OCI Resource Manager (Terraform on OCI) for infrastructure-as-code deployments.

Model Deployment & CI/CD Pipelines

  • Build and automate model deployment pipelines using OCI DevOps or MLflow.
  • Deploy and manage model inference using OCI Data Science, Triton, or vLLM on GPU-enabled OKE clusters.
  • Integrate OCI Functions for serverless orchestration and model triggering workflows.

Monitoring & Observability

  • Implement observability using OCI Monitoring, Logging, and Application Performance Monitoring (APM).
  • Track AI-specific metrics for latency, token usage, drift detection, and GPU performance.

Cloud Resource Optimization

  • Leverage OCI Cost Analysis and Budget tools for AI resource optimization.
  • Implement GPU auto-scaling and OCI Autoscaling policies for high-performance LLM workloads.
  • Monitor compute and network usage with OCI Observability and Management services.

Data & Model Governance

  • Use OCI Data Catalog and Object Storage lifecycle management for model lineage and governance.
  • Ensure compliance and data protection using OCI Vault for key and secret management.
  • Implement IAM policies and compartment-level security for multi-tenant LLM workloads.

DevOps/MLOps Foundation

  • CI/CD automation using OCI DevOps or GitHub Actions integrated with OCI pipelines.
  • Container security, vulnerability scanning, and image signing using OCI Registry (OCIR).
  • Strong proficiency in Python, Bash, Terraform, and YAML scripting.

Required Skills & Experience :

  • 10+ years’ experience in DevOps, MLOps, or Cloud Infrastructure engineering roles.
  • Hands-on experience managing Kubernetes clusters and GPU workloads on OCI or other major clouds.
  • Strong understanding of OCI core services including OKE, Compute, Networking, and Storage.
  • Experience in AI/ML pipeline deployment using OCI Data Science and OCI DevOps.
  • Proficiency in Terraform for OCI automation and environment provisioning.
  • Deep understanding of cloud-native architectures and observability patterns.


Drivestream’s Employee Benefits.

Remuneration

Drivestream offers competitive pay and attracts a diverse community of skilled individuals. We recognize the value of investing in our talent.

Medical, Disability and Life Insurance

We provide an array of coverage options, including full medical, full dental, and vision plans; employee life insurance; long-term and short-term disability (LTD and STD) coverage; a flexible spending account; and employee accidental death and dismemberment coverage.

Leave Benefits

Drivestream’s generous paid leave programs feature vacation/paid time off (PTO), holiday leave, and bereavement leave.

Professional Development

Our training and development programs include traditional classroom training; online courses in leadership, communication, and project planning; strategic planning and management programs; and a professional society membership incentive.

Work/Life Programs

Drivestream offers work-life integration options that help individuals manage their personal and professional responsibilities. Options include work from home and telecommuting, day care, flexible spending accounts, internal job transfers and career mobility, and health and wellness programs.

Community Involvement

Drivestream believes in supporting community and philanthropic activities that allow our employees to engage in outreach and educational programs.

Awards Programs

Drivestream recognizes and rewards our staff through various annual awards programs.

Retirement Benefits

Drivestream offers complete 401(k) plans and an annual profit-sharing contribution.

© 2025 Qureos. All rights reserved.