Qureos

Support Engineer

Job requirements

Hires in: Not specified
Employment Type: Not specified
Company Location: Not specified
Salary: Not specified

Equifax is seeking creative, high-energy, and driven software engineers with hands-on development skills to work on a variety of meaningful projects. Our software engineering positions give you the opportunity to join a team of talented engineers working with leading-edge technology. You are ideal for this position if you are a forward-thinking, committed, and enthusiastic software engineer who is passionate about technology.

What you’ll do

Monitoring & Observability (Datadog-Focused)

  • Own Observability: Design, implement, and maintain a comprehensive monitoring strategy using Datadog (Metrics, APM, Logs, Synthetics, and RUM).
  • Proactive Detection: Build and refine sophisticated dashboards, SLOs/SLIs, and alerts to identify performance bottlenecks and potential failures before they become customer-facing incidents.
  • Analyze: Use Datadog's full suite to trace complex issues across distributed microservices, from the front-end to the database.

Production Support & Incident Management

  • Incident Command: Act as the technical lead during high-priority production incidents, coordinating cross-functional teams (Development, DevOps, Product) to drive rapid resolution.
  • Root Cause Analysis (RCA): Conduct thorough, blameless post-mortems to identify the true root cause of incidents, documenting findings and tracking remedial actions.
  • On-Call: Participate in a rotating on-call schedule, serving as the primary escalation point for all production service issues.
  • War Room Leadership: Confidently manage "war room" scenarios, clearly communicating status, impact, and needs to both technical and business stakeholders.

Engineering & Automation (The "Dev" Component)

  • Code-Level Troubleshooting: Utilize your development background (e.g., Python, Go, Java, .NET) to read and understand application code, enabling you to pinpoint bugs and collaborate effectively with development teams on fixes.

  • Build Tools, Not Toil: Identify and automate repetitive manual tasks (toil) by building scripts, internal tools, and runbooks.

  • Influence Design: Partner with software engineers to champion "design for production," providing feedback on logging, metrics, and application reliability from the support perspective.

What experience you need

  • Bachelor's degree or equivalent experience

  • 5+ years in a Production Support, Site Reliability Engineering (SRE), or high-stakes DevOps role.

  • Datadog Expertise: Extensive, hands-on experience with the Datadog platform. You must be comfortable building complex dashboards, setting up monitors, and using APM and log analytics for deep-dive troubleshooting.

  • Production Incident Management: Proven track record of leading the response to and resolution of critical incidents in a 24/7, high-availability environment.

  • Development/Scripting: Strong prior development or scripting knowledge. Must be proficient in at least one language such as Python, Go, Bash, or PowerShell. The ability to read and debug code in languages such as Java or Node.js is a major plus.

  • Core Tech: Deep understanding of:

    • Cloud Platforms (AWS, Azure, or GCP)

    • Containerization (Kubernetes, Docker)

    • CI/CD Pipelines (Jenkins, GitLab CI, etc.)

  • Mindset: A calm, methodical, and detail-oriented approach to problem-solving, especially under pressure.

What could set you apart

  • Datadog Certification(s).

  • Experience with Infrastructure as Code (Terraform, Ansible).

  • Knowledge of other observability tools (e.g., Prometheus, Grafana, ELK Stack).

  • Experience in database performance tuning (SQL or NoSQL).

© 2025 Qureos. All rights reserved.