At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career.
Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.
How will you make an impact in this role?
Key Responsibilities:
- Manages collaboration with Software Engineering teams to design, develop, and implement features that enhance system resilience, scalability, and performance, proactively identifying and resolving system bottlenecks and failure points
- Develops and refines sophisticated automation tools and frameworks, including advanced infrastructure as code (IaC) practices, to streamline operational workflows, deployment processes, and infrastructure management, ensuring high system efficiency
- Engages in architectural design discussions, ensuring that advanced reliability, scalability, and performance considerations are integrated into strategic decision-making processes
- Designs and executes comprehensive chaos engineering experiments and advanced resiliency testing, analyzing results to implement improvements that strengthen system robustness and recovery capabilities
- Develops, optimizes, and maintains comprehensive disaster recovery plans and business continuity strategies, ensuring systems can recover quickly and effectively from complex and unexpected disruptions
- Advocates for observability practices by promoting and implementing best practices such as error budgeting, service-level objectives (SLOs), and service-level indicators (SLIs), contributing to a culture of continuous improvement and reliability
- Collaborates and co-creates effectively with teams in product and the business to align technology initiatives with business objectives
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or Engineering, and/or comparable experience; advanced degree preferred
- Knowledge of the modern observability stack (e.g., Splunk, Elasticsearch, Prometheus, Grafana)
- Knowledge of containerization technologies (e.g., Kubernetes, Docker) and microservices architecture
- Knowledge of observability tools and methodologies, including experience with logging, monitoring, tracing, and performance analysis platforms
- Knowledge of cloud-based Site Reliability Engineering (SRE) practices and experience with public cloud platforms such as AWS, Azure, or Google Cloud
We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for your and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally:
- Competitive base salaries
- Bonus incentives
- Support for financial well-being and retirement
- Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
- Flexible working model with hybrid, on-site, or virtual arrangements depending on role and business need
- Generous paid parental leave policies (depending on location)
- Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
- Free and confidential counseling support through our Healthy Minds program
- Career development and training opportunities
American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law.
An offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.