JOB_REQUIREMENTS
● On-prem infrastructure management
Maintain the uptime, reliability, and readiness of an on-prem engineering cloud spread across multiple data centers. Implement monitoring, alerting, and incident response procedures to ensure adherence to defined performance targets. Perform root cause analysis and post-mortems for incidents involving any threshold breach.
● Observability
Set up and manage monitoring and logging tools such as Prometheus, Grafana, or the ELK Stack to oversee system health and performance. Maintain KPI pipelines using Jenkins, Python and ELK.
Improve monitoring systems by adding custom alerts based on business needs.
● Tech stack
Bare-metal data center machine management tools such as IPMI, Redfish, and KVM.
Automation using Jenkins, Python, Go, Bash.
Infrastructure tools like Kubernetes, MySQL, Prometheus, Grafana and ELK.
Familiarity with hardware such as GPUs and Tegras is a plus.
Job Types: Full-time, Contract
Pay: $150,000.00 - $160,000.00 per year
Work Location: In person
© 2025 Qureos. All rights reserved.