Looking for a candidate with strong hands-on experience in Datadog.
Experience Required:
- 4+ years of hands-on experience in Datadog Monitoring, Dashboards, Alerts, and Integrations
- Strong expertise in Datadog integration with AWS services (CloudWatch, Lambda, EC2, RDS, ECS/EKS, API Gateway, Load Balancer, S3, etc.)
- Good knowledge of infrastructure monitoring, APM (Application Performance Monitoring), and log management
- Strong troubleshooting, analytical, and communication skills
Responsibilities:
- Integration & Setup
  - Configure and integrate Datadog with AWS accounts and services (EC2, ECS/EKS, Lambda, RDS, CloudFront, S3, API Gateway, etc.).
  - Ensure seamless CloudWatch metrics ingestion into Datadog.
  - Implement infrastructure, application, and log monitoring in line with best practices.
- Dashboards & Monitoring
  - Design and build custom Datadog dashboards for real-time visibility of infrastructure, applications, and business KPIs.
  - Create service-specific monitoring views (e.g., database health, API latency, container performance).
  - Develop business-level dashboards for stakeholders (e.g., uptime SLAs, cost monitoring).
- Alerts & Incident Management
  - Configure alerts and notifications for critical AWS resources and services.
  - Define thresholds, anomaly detection, and predictive alerts to minimize downtime.
- Performance & Optimization
  - Analyze performance data to identify bottlenecks, high latency, or resource inefficiencies.
  - Provide recommendations for scaling and optimization.
  - Set up APM and distributed tracing for microservices.
- Security & Compliance Monitoring
  - Implement Datadog Security Monitoring rules to detect suspicious activity in AWS.
- Documentation & Knowledge Transfer
  - Document all monitoring configurations, dashboards, and alerting logic.
  - Provide training and knowledge transfer to internal teams so they can continue operations after the engagement ends.
Job Type: Temporary
Contract length: 2 months