Qureos

Site Reliability Engineer

Urgently hiring for Site Reliability Engineer (SRE) / Chaos Engineer

Location: Hyderabad
Job Type: Full-time, Permanent

Job Description:
We are looking for an experienced Site Reliability Engineer (SRE) with strong Python automation skills (Boto3 required) and hands-on experience in chaos engineering to improve system reliability and resilience. The ideal candidate should have a solid background in cloud infrastructure, DevOps, and automation tools.

Responsibilities:

  • Design and run chaos experiments using Gremlin, AWS FIS, Litmus, or Chaos Mesh.
  • Develop automation scripts and tools in Python for reliability testing and infrastructure management.
  • Manage and automate infrastructure using Terraform or CloudFormation.
  • Deploy and manage containers using Docker, Kubernetes, and EKS.
  • Enhance system monitoring, observability, and incident response processes.
  • Collaborate with DevOps teams to maintain CI/CD pipelines (Git, Jenkins, GitLab, CodePipeline).

Required Skills:

  • Advanced Python development (Boto3 is a must).
  • Experience with chaos testing and reliability engineering.
  • Strong understanding of IaC, containers, and orchestration tools.
  • Knowledge of monitoring and incident management tools.
  • Understanding of resilience, disaster recovery, and high-availability design.

Preferred Skills:

  • Knowledge of Go or Shell scripting.
  • Experience with multi-cloud or hybrid-cloud setups.
  • Strong analytical, problem-solving, and debugging skills.

Pay: ₹2,000,000.00 - ₹3,000,000.00 per year

Experience:

  • Chaos testing: 3 years (Required)
  • Python: 3 years (Required)

Location:

  • Hyderabad, Telangana (Required)

Work Location: In person

© 2025 Qureos. All rights reserved.