Role Overview
We are looking for a Senior Technology Engineer to drive platform stability, automation, and operational excellence within the Data Science Platform (DSP). This is not a support role; it is a hands-on engineering role where you own automation, orchestration, and reliability across a hybrid cloud ecosystem (OpenShift + AWS/Azure/GCP).
You will be the backbone of DSP operations — if things break, scale poorly, or require manual intervention, that's your problem to eliminate permanently.
Requirements
Key Responsibilities
Platform Engineering & Operations
- Own end-to-end technical operations of DSP infrastructure
- Ensure high availability, performance, and scalability of platform services
- Monitor system health, troubleshoot issues, and implement permanent fixes (not patchwork)
Automation & Orchestration
- Design and implement automation frameworks to eliminate manual processes
- Build CI/CD pipelines and automate deployment workflows
- Drive infrastructure-as-code (IaC) adoption using tools like Terraform/Ansible
Container & Cloud Platform Management
- Manage and optimize OpenShift / Kubernetes environments
- Work across multi-cloud (AWS, Azure, GCP) infrastructure
- Ensure efficient resource utilization and cost optimisation
MLOps / Data Platform Support
- Enable smooth ML model deployment and lifecycle management
- Support tools like OpenShift AI, SageMaker, or similar platforms
- Ensure reproducibility and reliability of data science workflows
Monitoring & Reliability
- Implement monitoring using Prometheus, Grafana, and the ELK stack
- Define SLAs and SLOs, and ensure the platform meets reliability standards
- Drive proactive incident prevention (not reactive firefighting)
Collaboration & Governance
- Work closely with Data Scientists, DevOps, and Platform teams
- Ensure adherence to security, compliance, and governance standards
- Act as a technical SME for DSP operations
Mandatory Skills (Non-Negotiable)
- Strong experience in OpenShift / Kubernetes
- Hands-on experience in multi-cloud environments (AWS/Azure/GCP)
- Expertise in automation (Terraform, Ansible, Jenkins, GitOps)
- Strong knowledge of CI/CD pipelines and DevOps practices
- Experience in Python or scripting (Bash/Shell)
- Experience with monitoring tools (Prometheus, Grafana, ELK)
Good to Have
- Experience in MLOps / AI platforms (OpenShift AI, SageMaker, Bedrock)
- Exposure to LLM deployment / inference platforms (vLLM, Triton, etc.)
- Knowledge of data pipelines and big data ecosystems
- Banking or financial services experience
Experience Required
- 6-10 years of relevant experience in Platform Engineering / DevOps / Cloud Engineering
- Proven experience managing enterprise-scale platforms