Senior Site Reliability Engineer

We are seeking a Senior Site Reliability Engineer to join our team and ensure the reliability, scalability, and efficiency of our systems.

You will work closely with developers and operations teams to support seamless user experiences and meet client expectations. In this role, you will deploy, maintain, and automate infrastructure and application environments while driving continuous improvement in operational practices. If you have a strong background in SRE and cloud technologies, we encourage you to apply and contribute to our mission of delivering high-quality solutions.

Responsibilities

Collaborate with development, security, quality, and operations teams to apply site reliability engineering practices
Define and maintain reliability, availability, and performance targets for services and applications
Troubleshoot and resolve infrastructure and application issues promptly
Implement monitoring systems to track infrastructure and application reliability
Manage service level objectives, error budgets, and incident management processes
Automate repetitive tasks to reduce operational toil
Support capacity planning and performance optimization efforts
Conduct postmortem analyses to identify and address root causes of incidents
Drive continuous improvement initiatives in operational procedures and reliability engineering

Requirements

Bachelor’s degree in computer science, engineering, or a related field
Proven experience with cloud platforms such as AWS, GCP, or Azure
Experience implementing site reliability engineering practices including SLO/SLI, error budgets, postmortems, toil reduction, capacity planning, and incident management
Knowledge of Python or similar scripting/programming languages
Strong background in monitoring tools and techniques
Proficiency with continuous integration and continuous delivery tools, infrastructure as code, and configuration management
Solid knowledge of containerization and container orchestration technologies such as Docker and Kubernetes
Strong written and verbal English communication skills (B2+)

Nice to have

Expertise in deploying and managing large language models, including retrieval-augmented generation (RAG) systems
Certifications in Kubernetes, AWS, GCP, Azure, or related technologies
Experience in DevOps practices and tools
Knowledge of AI/ML model management including deployment, monitoring, and maintenance

We offer

CONTINUOUS UPSKILLING, LEARNING & DEVELOPMENT
- Diversity of tasks and projects
- Assessment center for objective review of competency level
- Personal development plan
- Mentoring programs and leadership development
- Certification and professional development support
- Access to learning platforms including more than 2,500 internal courses and the LinkedIn Learning library with 20,000+ courses
- English courses taught by certified teachers
CORPORATE BENEFITS
- Extra leave days
- Referral bonuses
COMPENSATION PACKAGE
- Competitive compensation paid in USD
- Regular salary and performance reviews
MEDICAL & HEALTHCARE
- Private health insurance
- Well-being events
WORKING ENVIRONMENT
- Recreation areas and kitchens
- Tea, coffee, and snacks
- Sports equipment and game consoles
- IT equipment
- Microsoft's Software Assurance Home Use Program (HUP)

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
