Site Reliability Engineer

Location: Raleigh, NC, United States
Job Type: Contract (Onsite)
Hours: 40 hrs/week, Monday–Friday, 9:00 AM – 5:00 PM
Eligibility: Only U.S. Citizens and Green Card Holders (No H1B, OPT, CPT or other work visas)

About the Role:

We are seeking experienced Site Reliability Engineers (SREs) to ensure the reliability, scalability, and performance of critical enterprise platforms. This hands-on role requires expertise in cloud infrastructure, Linux/Windows systems, automation, and observability, and involves working closely with cross-functional teams to deliver highly available and resilient services.

Key Responsibilities:

  • Design, implement, and maintain reliable, scalable, and secure systems across cloud and on-prem environments.
  • Manage distributed systems running on Azure, Linux (RHEL7+), and Windows Server 2019+.
  • Build and enhance automation workflows using Python, Go, and Bash.
  • Develop Infrastructure-as-Code (IaC) solutions with Terraform, Ansible, or similar tools.
  • Define, monitor, and improve SLIs, SLOs, and SLAs to ensure consistent service quality.
  • Reduce operational toil through automation, tooling enhancements, and process improvements.
  • Integrate systems with observability platforms for proactive issue detection.
  • Troubleshoot complex incidents, lead incident response, and conduct post-mortem analyses.
  • Collaborate with software engineering, infrastructure, and business teams to optimize system reliability, performance, and maintainability.

Requirements:

  • Proven experience as a Site Reliability Engineer, or in a similar role in software engineering, infrastructure, or operations.
  • Hands-on experience with cloud platforms (Azure) and enterprise OS (Linux RHEL7+, Windows Server 2019+).
  • Knowledge of networking and storage (NFS, SAN, NAS).
  • Familiarity with DNS, LDAP, Kerberos, and Centrify authentication services.
  • Proficiency in scripting and automation with Python, Go, and Bash.
  • Practical experience with Terraform, Ansible, or other IaC tools.
  • Ability to design, monitor, and improve SLIs, SLOs, and SLAs.
  • Experience integrating with modern observability platforms.
  • Strong communication and collaboration skills with cross-functional teams.
  • Calm, structured, and solution-oriented during high-pressure incidents.
  • Proactive, ownership-driven mindset with a focus on continuous improvement.

Skills:

Site Reliability Engineering, Azure, Linux (RHEL7+), Windows Server 2019+, Networking, NFS, SAN, NAS, DNS, LDAP, Kerberos, Centrify, Python, Go, Bash, Terraform, Ansible, IaC, Observability, SLIs/SLOs/SLAs, Automation, Incident Response, Metrics-Driven Reliability, System Performance, Cross-Functional Collaboration, Operational Excellence

Job Type: Full-time

Pay: $55.00 per hour

Work Location: In person

© 2025 Qureos. All rights reserved.