Journey with us! Combine your career goals and sense of adventure by joining our exciting team of employees. Royal Caribbean Group is pleased to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world.
We are proud to be the vacation-industry leader with global brands (including Royal Caribbean International, Celebrity Cruises, and Silversea Cruises), the most innovative fleet and private destinations, and the best people. Together, we are dedicated to turning the vacation of a lifetime into a lifetime of vacations for our guests.
The Royal Caribbean Group’s AI & Analytics Team has an exciting career opportunity for a full-time Senior Engineer, AIOps, reporting to the Senior Manager, Data Intelligence Operations.
The position is onsite and based in Miramar, Florida.
The position is not eligible for work authorization sponsorship.
Position Summary
The Senior Engineer, AIOps, serves as a technical anchor for the reliability, scalability, and continuous improvement of Royal Caribbean Group’s enterprise AI, Generative AI (GenAI), and modern data platforms. This senior-level role leads incident response, drives operational maturity, mentors junior team members, and partners with platform engineering and data science teams to shape how AI and data systems are built, deployed, and maintained at scale. The ideal candidate brings deep expertise in Microsoft Azure and Databricks, a strong command of LLM and GenAI tooling, and the judgment to make sound architectural and operational decisions independently.
Essential Duties and Responsibilities
- Leads the operational health and reliability of enterprise AI, GenAI, and data platforms, ensuring high availability and performance
- Serves as the senior technical escalation point for L2/L3 production issues across AI and GenAI-enabled applications, including LLM-based services and RAG pipelines
- Designs and owns observability strategies for AI platform health, covering availability, latency, throughput, cost attribution, and model behavior drift
- Leads root cause analysis for complex AI inference failures and drives permanent remediation across engineering and product teams
- Evaluates, onboards, and operationalizes new GenAI capabilities, including Azure OpenAI Service, Foundation Model APIs, and vector store solutions
- Defines operational standards, SLAs, and runbooks for AI platform services, championing a proactive operations culture
- Builds and operates AIOps pipelines that leverage GenAI to analyze incidents, identify failure causes, and recommend remediation actions
- Integrates AIOps insights into CI/CD pipelines, validating deployments against known failure patterns and implementing AI-driven quality gates
- Owns the operational health of enterprise data platforms built on Azure and Databricks, including governance, table management, and job orchestration
- Leads cloud cost governance efforts for Databricks and Azure services, partnering with FinOps to optimize spend
- Enforces and continuously improves platform security posture, including RBAC, managed identity, network policies, and secrets management
- Leads major incident response for platform outages, produces high-quality RCAs, and drives post-incident improvements
- Mentors and guides junior engineers, contributing to hiring, onboarding, and skills development within the AIOps team
Qualifications, Knowledge, and Skills
- Bachelor’s degree in Computer Science, Engineering, or related field required; Master’s degree preferred
- 7+ years of experience in platform operations, cloud engineering, AI/data platform support, or site reliability engineering in enterprise environments
- Deep hands-on experience with Microsoft Azure, including Azure OpenAI Service, Azure AI Search, Azure Data Factory, Azure Monitor, and related data and AI services
- Expert-level experience with Databricks, including Unity Catalog administration, cluster and pool management, Delta Lake operations, and job orchestration at scale
- Strong command of LLM and GenAI concepts, including inference architectures, RAG pipelines, embeddings, vector databases, and model serving patterns
- Proficiency in Python and SQL, with experience automating operational tasks and reviewing pipeline and application code
- Demonstrated ability to lead incident response independently, produce high-quality RCAs, and drive cross-functional remediation
- Experience with ITSM platforms (ServiceNow preferred) and formal incident and change management processes
- Strong communication skills, with the ability to translate complex technical issues into clear, actionable updates for both technical and non-technical stakeholders
- Expertise in AI and data platform operations, observability, and incident management
- Proficiency in cloud cost optimization and FinOps practices
- Experience with CI/CD pipelines, DevOps practices, and automation tools
- Strong understanding of platform security, governance, and compliance requirements
- Demonstrated ability to mentor and guide junior engineers
- Strong organizational, analytical, and problem-solving skills
- Ability to foster a culture of operational excellence and continuous improvement
- Effective collaborator with cross-functional teams and external partners
Agency and Third-Party Submissions: Please note this is a direct search by the Company, and applications through agencies and other third parties will not be accepted, nor will fees be paid for unsolicited resumes. Any unsolicited resumes will be considered the Company's property.
We know there's a lot to consider. As you go through the application process, our recruiters will be glad to provide guidance and more relevant details to answer any additional questions. Thank you again for your interest in Royal Caribbean Group. We hope to see you onboard soon!
It is the policy of the Company to ensure equal employment and promotion opportunity to qualified candidates without discrimination or harassment on the basis of race, color, religion, sex, age, national origin, disability, sexual orientation, sexuality, gender identity or expression, marital status, or any other characteristic protected by law. Royal Caribbean Group and each of its subsidiaries prohibit and will not tolerate discrimination or harassment.