Qureos


Infrastructure & Reliability Operations Lead

Job Description


Role Purpose:


The Head of Infrastructure and Reliability Operations is responsible for leading the organization’s infrastructure operations and ensuring the reliability, availability, and performance of enterprise IT platforms. This role oversees the operational management of core infrastructure services including compute, virtualization, storage, networking, operating systems, and platform services across on-premises and cloud environments.


The role focuses on delivering stable, secure, and highly available infrastructure services, establishing operational excellence, and implementing reliability engineering practices to ensure business-critical systems operate efficiently and meet defined service level objectives.


The position requires strong leadership in infrastructure operations, service reliability, incident management, and operational automation, with the goal of maintaining resilient platforms that support the organization’s digital services and business operations.


Key Accountabilities:


1- Lead the enterprise infrastructure operations function responsible for compute, storage, network, virtualization, and platform services.

2- Ensure the availability, stability, and performance of all infrastructure platforms supporting critical business applications.

3- Oversee the operation of datacenter infrastructure and hybrid cloud environments.

4- Establish operational governance, standards, and best practices across infrastructure teams.


Reliability and Service Availability


5- Implement and lead Site Reliability Engineering (SRE) and operational reliability practices to ensure high service availability.

6- Define and monitor Service Level Objectives (SLOs), Service Level Agreements (SLAs), and operational performance metrics.


Skills


  • Deep understanding of enterprise infrastructure technologies, including:

      • Server platforms and operating systems (Linux / Windows)

      • Virtualization platforms (VMware or similar)

      • Enterprise storage and backup solutions

      • Network infrastructure and connectivity

  • Experience with monitoring, observability, and infrastructure management tools.

  • Knowledge of high availability, disaster recovery, and capacity management practices.

  • Familiarity with automation tools and infrastructure management frameworks.

  • Understanding of cloud infrastructure platforms (GCP, OCI, AWS, or Azure).

Education

Bachelor’s degree in Computer Science, Information Technology, or a related field.


© 2026 Qureos. All rights reserved.