Senior Site Reliability Engineer

🌍 About SAQAYA


SAQAYA is a fast-growing international technology consultancy operating across the UK, Spain, and Egypt.


We partner with leading technology companies to build high-impact digital platforms by connecting exceptional engineers with ambitious product teams.

Our mission is simple: build outstanding technology by empowering outstanding people.


💼 About the Product Environment


Our client operates within the financial market data space, delivering mission-critical production systems that demand high availability, stability, and performance.


The engineering culture emphasizes automation, reliability, and continuous improvement. You will work in a collaborative environment where infrastructure quality directly impacts product success and user trust.


This is a high-impact role where you will help shape production environments, improve deployment processes, and champion reliability best practices across teams.


🧩 The Role


We’re looking for a Senior Site Reliability Engineer who is passionate about automation, infrastructure design, and maintaining highly available production systems.


You will play a key role in ensuring system stability, improving deployment workflows, and collaborating closely with development and data teams to support evolving product requirements.

This role reports to the Head of Engineering.


If you enjoy solving complex operational challenges, automating everything possible, and driving long-term reliability improvements, this role is for you.


🔧 What You’ll Be Working On

🚀 Production Reliability & Monitoring

  • Monitor and support production environments to ensure high availability and performance
  • Proactively identify operational issues before they impact users
  • Handle incident response, troubleshooting, and root cause analysis
  • Participate in an on-call rotation to ensure uptime



⚙️ Infrastructure & Automation

  • Build and maintain automation scripts for cloud-based deployments
  • Design and implement production infrastructure
  • Apply infrastructure-as-code practices (Terraform, Ansible, Puppet)
  • Improve deployment processes to promote stable and regular releases
  • Focus on automating repetitive operational tasks



📊 Observability & Performance

  • Implement and manage monitoring and logging solutions (Prometheus, Grafana, Nagios)
  • Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs)
  • Manage capacity planning and performance optimization
  • Apply disaster recovery strategies, backups, and redundancy planning
  • Work with error budgets to balance innovation and reliability



🤝 Cross-Functional Collaboration

  • Work closely with Data Insights and Product Development teams
  • Promote DevOps culture and continuous integration/deployment practices
  • Improve production efficiency using engineering best practices
  • Ensure reliability and performance align with end-user expectations



🎯 What We’re Looking For

🧠 Core Competencies

  • Strong ownership mindset
  • Analytical and structured problem-solving approach
  • Customer-focused reliability thinking
  • Ability to drive long-term fixes, not temporary patches
  • Adaptable and eager to learn new technologies



💻 Technical Requirements

  • Strong Python (or similar language) skills for automation and tooling
  • Solid Linux/Unix administration experience
  • Experience managing distributed systems
  • Cloud platforms: AWS, GCP, or Azure
  • Containerization (Docker) and orchestration (Kubernetes)
  • Strong networking fundamentals (TCP/IP, DNS, HTTP, load balancing, firewalls)
  • Monitoring and observability tooling
  • Infrastructure-as-code experience
  • Incident management and performance troubleshooting experience
  • Experience working with high-availability, production-critical systems



⭐ Nice to Have

  • Experience with financial index benchmark systems
  • Familiarity with error budgets and reliability trade-offs



🌟 Why This Role Is Exciting

  • High-impact role within mission-critical financial systems
  • Opportunity to influence production architecture and deployment strategy
  • Strong engineering culture focused on automation and reliability
  • Collaborative, product-driven environment
  • Real ownership and ability to effect meaningful change
