We are a fast-growing UAE-based real estate property portal operating production-grade cloud infrastructure on AWS. We are looking for a hands-on DevOps Engineer to own, maintain, and evolve our infrastructure: someone with hands-on experience in production EKS environments, not just theoretical Kubernetes knowledge. You will be the primary owner of our cloud infrastructure and CI/CD pipelines, working directly with the engineering team to ensure platform stability, scalability, and security.
What You’ll Actually Work On
Our stack is not generic. You will be expected to hit the ground running on:
- AWS EKS — Self-managed and managed node groups with EC2 worker nodes, spread across 3 availability zones
- Multi-region AWS — Production runs on me-central-1 (UAE); familiarity with cross-region architecture is a plus
- Kubernetes workloads — Managing Prod-Backend, Prod-Frontend, and internal services as containerized deployments with NGINX Ingress and API Gateway routing
- Aurora PostgreSQL — Multi-AZ cluster with Writer, Read Replica, and Reader nodes across DB subnets; performance tuning and failover management
- MongoDB & Redis — Operational management, backup policies, and performance optimization
- Elasticsearch — Maintaining search infrastructure integrated with our property listing engine
- Networking — VPC design, public/private subnet segmentation, NAT Gateways with Elastic IPs, Network Load Balancer (NLB), Internet Gateway, and Security Groups
- Monitoring — Prometheus stack; extending alerting and dashboards for proactive incident detection
- Third-party integrations — S3, Pusher, Twilio, SendGrid, Maqsam, SleekFlow, Stripe; ensuring reliable connectivity and uptime from the infrastructure side
- CI/CD pipelines — Building and maintaining deployment pipelines for our backend and frontend microservices
Key Responsibilities
- Manage and optimize Skyloov’s AWS EKS cluster infrastructure including node groups, IAM/IRSA, and Kubernetes workloads
- Maintain high availability of Aurora PostgreSQL, MongoDB, and Redis — including backup, monitoring, and failover procedures
- Manage NGINX Ingress, NLB, NAT Gateways, and all networking components across 3 AZs
- Build and maintain CI/CD pipelines for continuous delivery of backend and frontend services
- Write and maintain automation scripts in Bash and Python for operational tasks
- Proactively monitor infrastructure via Prometheus; set up alerting to catch issues before they impact users
- Manage Elasticsearch cluster performance as it relates to property search functionality
- Maintain and improve the DR (Disaster Recovery) runbook and conduct periodic DR tests
- Drive cost optimization across AWS services (compute, storage, data transfer) and third-party API usage
- Enforce security best practices: IAM least-privilege, Security Group hygiene, secrets management
- Support the development team with environment setup, debugging, and incident response
Required Qualifications
- 3–5 years in a DevOps or Cloud Infrastructure role
- Hands-on AWS EKS experience is mandatory — candidates without real Kubernetes production experience will not be considered
- Strong working knowledge of: EC2, EKS, RDS Aurora, Elasticsearch, S3, IAM, VPC, NLB, NAT Gateway
- Proficiency in Docker and Kubernetes (deployments, services, ingress, secrets, configmaps, HPA)
- Experience with Prometheus or equivalent monitoring stack
- Solid scripting skills: Bash and Python
- Infrastructure-as-Code: Terraform or CloudFormation (Terraform preferred)
- Database experience: PostgreSQL and MongoDB administration (not just development)
- Strong understanding of VPC networking, subnetting, routing, and security groups
- Familiarity with CI/CD tooling (GitHub Actions, GitLab CI, or equivalent)
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
Nice to Have
- Experience with KEDA (Kubernetes Event-Driven Autoscaling)
- Prior work in a SaaS or marketplace product environment
- Exposure to AWS Middle East (me-central-1) region
- Experience managing Elasticsearch at scale for search-heavy applications
- Familiarity with any of: Pusher, SleekFlow, Maqsam, Twilio (from infrastructure/integration side)
Experience: 3–5 years
Work Location: Remote