India
Project Role : Infrastructure Engineer
Project Role Description : Assist in defining requirements, designing and building data center technology components and testing efforts.
Must have skills : Infrastructure Automation
Good to have skills : Python (Programming Language), Work Load Automation Architecture and Design, Automation Architecture
Minimum 5 year(s) of experience is required
Educational Qualification : 15 years full time education
Summary: The Server SRE is responsible for ensuring the reliability, scalability, and performance of server infrastructure. This role combines software engineering, development, and systems engineering to automate operations, manage incidents, and achieve a noise-free environment. The candidate will do the automation development work and work closely with infrastructure teams to implement observability and automation solutions.
Must Have Skills
- Strong experience in Linux/Unix server administration
- Proficiency in automation development using Python, Bash, or Shell scripting
- Hands-on experience with monitoring tools such as Prometheus, Grafana, SolarWinds
- Ability to analyze incidents and problems to reduce alert noise
- Experience with CI/CD pipelines, GitHub, and DevOps practices
- Hands-on experience creating IaC (development using Terraform / Ansible)
- Familiarity with server performance metrics and observability tools
Good to Have Skills
- Experience with cloud platforms (AWS, Azure, GCP)
- Knowledge of container orchestration (e.g., Kubernetes)
- Familiarity with infrastructure as code tools (e.g., Terraform, Ansible)
- Exposure to incident management frameworks (e.g., ITIL, SRE principles)
Job Requirements
Minimum of 7 years of experience in server administration and reliability engineering. Strong analytical skills and ability to work in a fast-paced environment. Must be able to implement automation and monitoring solutions and analyze incidents to maintain system stability.
Key Responsibilities
- Monitor and maintain server health across environments
- Automate operational tasks and reduce manual interventions
- Implement observability solutions including metrics, logging, and tracing
- Analyze incidents and perform root cause analysis
- Collaborate with teams to improve system reliability and reduce alert noise
- Design scalable server architectures for high availability
- Conduct capacity planning and performance tuning
Technical Experience
Hands-on experience with server monitoring and automation tools. Strong scripting skills and familiarity with observability platforms. Experience in analyzing incidents and implementing solutions to reduce noise and improve reliability.
Professional Attributes
Excellent problem-solving and analytical skills. Strong communication and collaboration abilities. Proactive mindset with a focus on continuous improvement and operational excellence.
Educational Qualification and Certification
Bachelor’s Degree in Computer Science, Information Technology, or related field. Certifications in Linux administration, cloud platforms, or SRE practices are a plus.
© 2025 Qureos. All rights reserved.