
Consulting Member of Technical Staff (IC5)

United States

The ideal candidate will be technically adept, with a proven track record of building highly available, scalable, and redundant services. You understand the value of simplicity in system design, knowing that it makes systems easier to operate, troubleshoot, and scale. You can balance speed and quality, embracing iteration and continuous improvement.

You should have a strong background in operating high-scale services with a deep understanding of how to improve their resilience, performance, scalability, and overall customer experience. You’ll be expected to work independently, drive projects to completion, and provide technical leadership to development teams.

In this role, you’ll contribute to the platform's design and development, overseeing in-house engineering, design reviews, system integration, and operational enhancements. This role isn’t just about shipping features—it’s about elevating engineering standards within OCI. You’ll lead large, impactful projects, collaborate across engineering, product, and operations teams, and mentor engineers at all levels, helping shape a vision for exceptional cloud infrastructure.

As a Consulting Member of Technical Staff (IC5), you will provide technical leadership for Oracle’s messaging and eventing ecosystem — including but not limited to Oracle Streaming, Oracle Queue, and Oracle Streaming Service with Apache Kafka services. You will define the architecture, reliability, and scalability strategy for these core services, enabling event-driven and streaming workloads across Oracle Cloud Infrastructure (OCI).

You will:

Architect, design, and operate distributed, highly available, and resilient systems supporting real-time data ingestion, message queuing, and stream processing at massive scale.
Define and drive the technical roadmap for Streaming, Queue, and Managed Kafka services.
Lead system design for multi-tenant, horizontally scalable, and cost-efficient architectures that deliver consistent latency, throughput, and durability across OCI regions.
Collaborate cross-functionally with storage, networking, observability, and security teams to deliver new platform features, enforce secure-by-default designs, and improve overall fleet reliability.
Mentor and guide engineers in distributed systems design, high-scale data processing, and operational excellence; set and raise engineering standards across multiple teams.
Drive operational excellence by owning service-level objectives (availability, latency, durability) and reducing toil through automation, observability, and self-healing mechanisms.
Own the full service lifecycle from design and implementation to deployment, on-call, and continuous improvement — maintaining high code and reliability standards.
Partner with product management and field teams to translate customer needs into roadmap priorities for Oracle Streaming and Queue services.
Contribute to the broader platform vision, influencing how Oracle’s messaging and eventing services evolve to support mission-critical workloads globally.

Must Have Qualifications:

15+ years of professional experience developing and operating large-scale, distributed systems or cloud-native services.
Deep expertise in Apache Kafka, including Raft/ZooKeeper/KRaft internals, performance, latency, and operating production Kafka clusters at scale.
Strong hands-on experience with message queuing systems such as RabbitMQ, ActiveMQ, or equivalent enterprise queue technologies, including understanding of AMQP protocols and queue semantics (FIFO, DLQ, fan-out, and priority).
Hands-on experience with Kubernetes, including deployment, scaling, and operating stateful workloads in containerized environments.
Proficiency in Java, Go, or similar object-oriented languages; ability to produce high-quality, performant, and maintainable code.
Experience with operating at scale — production debugging, performance tuning, capacity modeling, and regional failover strategies.
Demonstrated technical leadership, influencing architecture and execution across multiple teams, and mentoring other senior engineers.
Excellent communication skills, able to articulate complex designs and trade-offs clearly across engineering and product stakeholders.
Experience with cloud platforms (OCI, AWS, Azure, GCP) and modern deployment frameworks (Kubernetes, Terraform, CI/CD).

Nice to Have Qualifications:

Experience designing or operating Tier-0 or mission-critical services, with stringent SLAs for availability, latency, and durability.
Experience contributing to or extending open-source messaging systems (Kafka, RabbitMQ, Pulsar, Flink).
Familiarity with observability stacks (Prometheus, OpenTelemetry, Grafana) and operational excellence principles (SLOs, SLIs, error budgets).
Understanding of OCI-specific services, IAM integration, and region/fault-domain isolation models.
