NOC Site Reliability Engineer (SRE) - (AM/MGR)

We are looking for a proactive NOC Site Reliability Engineer (SRE) to join our Infrastructure and Operations team. This role is critical in ensuring the reliability, security, and availability of Hugo Bank’s IT infrastructure, spanning cloud, on-premises, and networking environments. The successful candidate will have hands-on expertise in AWS cloud, MPLS networks, virtualization, Kubernetes, and monitoring tools, with a strong focus on compliance, SOP adherence, and Root Cause Analysis (RCA). You will collaborate with multiple teams to maintain operational excellence, optimize systems, and ensure compliance with regulatory and internal policies.

Core Responsibilities:

  • Monitor, manage, and optimize IT infrastructure, including MPLS L2/L3 links, firewalls, switches, access points, and SSL certificates.
  • Maintain and monitor AWS cloud resources, VMware virtualized environments, and Kubernetes clusters.
  • Monitor and ensure availability of critical services, including ALB, NLB, DNS, API endpoints, and websites.
  • Implement routine patching, vulnerability assessments, anti-virus monitoring, and firewall health checks.
  • Enforce and improve Standard Operating Procedures (SOPs) for NOC operations, ensuring compliance with internal policies and regulatory frameworks.
  • Perform Root Cause Analysis (RCA) for incidents and provide actionable recommendations to prevent recurrence.
  • Maintain dashboards and reports for system health, performance, and compliance audits using monitoring and observability tools such as:
      • SolarWinds NMS for network monitoring
      • Grafana / Prometheus for metrics visualization
      • Elasticsearch / ELK Stack for log aggregation and analysis
      • AWS CloudWatch / CloudTrail for cloud monitoring and auditing
      • AWS Inspector for vulnerability assessment
      • Nagios / Zabbix or similar NOC monitoring platforms
  • Collaborate with security, development, and operations teams to improve service reliability, availability, and performance.
  • Participate in on-call rotations and provide support for critical incidents to ensure business continuity.

Requirements:

  • Bachelor’s degree in Information Technology, Computer Science, or a related field.
  • A minimum of 3 years of experience in NOC, SRE, or cloud/infrastructure operations, preferably in the financial sector.
  • Proven experience with AWS cloud, VMware, Kubernetes, and containerized application management.
  • Hands-on experience with network monitoring, firewalls, switches, access points, and MPLS networks.
  • Experience with backup systems, patching, vulnerability management, and SSL certificate monitoring.
  • Strong knowledge of incident management, RCA, compliance, and SOP implementation.
  • Familiarity with regulatory requirements and IT operations compliance frameworks.

Preferred Certifications:

  • AWS Certified Cloud Practitioner
  • Kubernetes certifications (CKA, CKAD)
  • Networking/Security certifications (CCNA, Fortinet NSE)

© 2026 Qureos. All rights reserved.