Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Serve as the authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate the technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.
Responsibilities
Oracle Cloud Infrastructure (OCI) delivers mission-critical applications for top-tier enterprises around the world. Our cloud offers unmatched hyper-scale, multi-tenant services deployed in more than 30 regions worldwide. OCI is expanding its mission beyond the traditional boundaries of public cloud to include dedicated, hybrid, and multicloud deployments, edge computing, and more.
At the Multicloud Services organization, our mission is to support customer choice, transparency, and value when it comes to cloud infrastructure. We make it easy for our customers to maximize the value of their Oracle investment, as well as of other clouds or on-premises infrastructure, and to build highly distributed, scalable, and resilient multicloud solutions to support their business.
We are looking for hands-on engineers with expertise and passion in solving difficult problems in all areas of cloud service software engineering: high scale distributed systems, virtualized infrastructure, identity, security, observability, and user experience.
We are growing fast, still at an early stage, and working on ambitious new initiatives. An engineer at any level can have significant technical and business impact here. You will be part of a team of smart, motivated, diverse people, and given the autonomy as well as support to do your best work. It is a dynamic and flexible workplace where you’ll belong and be encouraged.
Who are we looking for?
We are looking for a Site Reliability Engineer who will operate and help develop tools for the multicloud services. You should be comfortable defining how to use the latest technologies to identify and optimize operational efficiency. You will be responsible for the infrastructure and reliability of all multicloud and other network monitoring services. You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
A great SRE will make all the difference in delivering quality solutions to our customers. You will be the subject matter expert for VMware on OCI, able to resolve complex on-premises-to-OCI migrations and customer escalations, and you will serve as the Tier-2 point of contact for support.
Are you passionate about designing, developing, testing, and delivering infrastructure for cloud services? Do you thrive in a fast-paced environment and want to be an integral part of a truly great team? If yes, come join us!
Qualifications:
Preferred Qualifications