Responsibilities:
- Manage and automate infrastructure using tools like Terraform, Ansible, or CloudFormation.
- Monitor system performance and troubleshoot issues using tools like Prometheus, Grafana, or Datadog.
- Manage and deploy containerized applications using Docker and Kubernetes.
- Collaborate with development and operations teams to enhance deployment processes.
- Implement security best practices for infrastructure and applications.
- Maintain cloud environments (AWS, Azure, or GCP) and perform cost optimization.
Key Requirements:
- 3+ years of hands-on experience managing production workloads on Amazon Web Services, including designing, deploying, and operating highly available and fault-tolerant systems.
- Strong ownership of CI/CD systems, with proven experience designing and maintaining pipelines for automated build, test, and deployment, including rollback strategies, environment isolation, and zero-downtime deployments.
- Deep understanding of cloud networking, including VPC design, public/private subnet segmentation, NAT gateways, internet gateways, route tables, DNS resolution, and security groups, with the ability to design secure and scalable network architectures from scratch.
- Extensive experience with core AWS services, including load balancing (ALB), compute (EC2, ECS), databases (RDS), and Auto Scaling Groups, along with strong working knowledge of S3, CloudFront, SNS, IAM (least-privilege design), CloudWatch, Route 53, and Lambda for event-driven workflows.
- Strong expertise in Infrastructure as Code, particularly Terraform, including modular design, state management, and environment separation. Experience with AWS CDK is a plus.
- Hands-on experience with observability and monitoring, including:
  - Centralized logging using Grafana and Loki.
  - Metrics, alerting, and incident response using CloudWatch, SNS, and Lambda-based automation.
- Solid understanding of cloud security and compliance, with experience implementing best practices aligned with standards such as SOC 2, PCI-DSS, or GDPR, including IAM policies, secrets management, and secure data handling.
- Proven ability to optimize cloud infrastructure for cost, performance, and scalability, including rightsizing resources, implementing autoscaling strategies, and optimizing storage and data transfer costs.
- Hands-on experience with configuration management tools such as Ansible, including provisioning, automation, and system configuration at scale.
- Strong experience with containerization, including building, optimizing, and managing containerized workloads using Docker and orchestration via ECS.
- Solid hands-on experience with Kubernetes, including production-grade deployments on EKS and Rancher, workload management, autoscaling, networking, and troubleshooting.
- Ability to produce and maintain clear, structured, and up-to-date technical documentation for infrastructure, processes, and runbooks.
- Working knowledge of JavaScript ecosystems and microservices architectures, with the ability to collaborate effectively with backend teams and debug application-level issues when needed.
- Ability to lead effective incident response with proper documentation, including clear root cause analysis (RCA), resolution steps, and preventive actions.
Age Limit: 23-26
Job Type: Full-time
Pay: Rs100,000.00 - Rs200,000.00 per month
Work Location: In person