Qureos

Site Reliability Engineer/Incident Manager

Pakistan

Overview

  • Own and manage critical customer escalations for large enterprise customers (primarily utility companies & investor-owned utilities).
  • Act as the single point of contact during incidents — managing customer communication, hosting/leading calls, and tracking action items to closure.
  • Collaborate with Engineering, Support, and Cloud Operations to resolve incidents involving APIs, Node.js services, web servers, load balancers, Cloudflare CDN, and other infrastructure components.
  • Apply technical troubleshooting/programming background to accelerate resolution.
  • Domain knowledge of utility messaging and extreme weather notifications is a strong plus.

Key Responsibilities

Incident Management

  • Serve as primary customer contact during critical incidents.
  • Schedule and lead escalation calls, ensuring updates and agreed next steps are clear.
  • Track and drive all action items to closure.
  • Partner with Customer Success Managers (CSMs) to align escalation handling with account strategy.

Technical Oversight

  • Oversee resolution of incidents involving APIs, Node.js, servers, load balancers, Cloudflare CDN, and cloud infra.
  • Translate technical issues into business impact for customers.
  • Provide hands-on troubleshooting or code/configuration review where your technical background allows.

Cross-Functional Coordination

  • Collaborate with Engineering, Support, and Cloud Ops to break down complex issues into actionable steps.
  • Align on incident priorities, deadlines, and dependencies.
  • Escalate internally when blockers threaten resolution timelines.

Communication & Reporting

  • Send frequent, accurate status updates to customers and stakeholders.
  • Keep leadership informed of status, risks, and mitigation.
  • Document incident details, root cause, and resolution for post-incident reviews.

Post-Incident Activities

  • Validate resolution with customer before closing.
  • Conduct post-mortems to identify lessons learned and process improvements.
  • Update internal playbooks and knowledge base.

Qualifications

Required:

  • 5+ years in customer-facing roles in enterprise SaaS, technical support, or incident management.
  • 8+ years of IT experience.
  • Strong technical understanding of APIs, Node.js, servers, load balancers, Cloudflare CDN, cloud infra.
  • Proven track record coordinating high-severity incidents across multiple teams.
  • Exceptional communication skills for technical and executive audiences.
  • Strong organizational skills; ability to manage multiple escalations under pressure.
  • Flexibility to work across US Eastern/Pacific time zones.

Preferred:

  • Background as a programmer or technical engineer.
  • Experience with utility messaging & extreme weather notification systems.
  • Knowledge of ITIL incident management practices.
  • Familiarity with SaaS architecture, cloud hosting, distributed systems.
  • Understanding of utility industry regulations & operations.

Job Type: Full-time

Application Question(s):

  • What are your current and expected salaries?

Work Location: Remote
