Experience: 10–12 Years
Location: Egypt (Onsite Role)
Employment Type: Full-Time
We are seeking a highly experienced Lead MLOps Architect with deep AWS expertise to lead the design, architecture, and governance of enterprise-grade ML platforms. This role requires strong leadership capabilities, hands-on expertise in scalable ML systems, and experience managing large production environments.
- Architect and lead enterprise-scale MLOps platforms on AWS
- Define best practices for ML lifecycle management, deployment standards, and governance
- Lead production deployment of ML models using AWS-native services
- Design automated CI/CD pipelines for ML workflows and infrastructure
- Implement advanced monitoring, drift detection, retraining automation, and observability
- Ensure high availability, scalability, security, and cost optimization
- Establish model versioning, reproducibility, and experiment tracking standards
- Lead troubleshooting of complex production issues
- Mentor and lead a team of MLOps and platform engineers
- Collaborate with stakeholders to align ML platform strategy with business objectives
- 10–12 years of overall experience with a strong focus on ML production systems
- Proven experience leading ML platform architecture and large-scale deployments
- Deep understanding of ML lifecycle management, governance, and reproducibility
- Hands-on experience with TensorFlow, PyTorch, and Scikit-learn
- Strong experience with MLflow or enterprise model management tools
- Advanced hands-on expertise in:
  - Amazon SageMaker (training, pipelines, endpoints)
  - S3, EC2, Lambda
  - ECR, ECS, EKS
  - IAM, CloudWatch
- Experience designing secure, compliant, and scalable ML architectures
- Experience implementing cost optimization strategies on AWS
- Strong expertise in Docker and Kubernetes (EKS)
- Advanced CI/CD implementation
- Infrastructure as Code using Terraform and/or CloudFormation
- Experience implementing GitOps practices
- Expert-level Python skills
- Experience designing robust data pipelines
- Strong understanding of SQL/NoSQL systems
- Exposure to streaming or real-time ML systems
- AWS Professional-level certifications
- Experience with ML security, explainability, and regulatory compliance
- Experience building enterprise feature stores
- Exposure to real-time inference systems