Scope:
- Receive work assignments through the ticketing system or from Lead Administrators or management.
- Provide Level 3/4 support for all observability capabilities across cloud, on-premises, and hybrid environments, with a primary focus on the Elastic Observability Platform.
- Design, implement, and maintain end-to-end monitoring, logging, tracing, and alerting systems using the Elastic Stack (Elasticsearch, Kibana, Elastic Agents/Beats, Logstash, APM).
- Collaborate with internal and external stakeholders to ensure reliable telemetry, high-quality alerting, and seamless integrations with infrastructure, networks, SaaS platforms, and cloud services.
- Ensure security, scalability, performance, and availability of observability systems and enterprise monitoring pipelines.
- Support monitoring maturity initiatives by improving instrumentation, reducing alert noise, and delivering actionable operational insights.
- Develop and maintain synthetic monitoring, user-experience metrics, health checks, and alerting for key SaaS applications including Workday, Salesforce, and ServiceNow.
- Support incident management by providing real-time dashboards, correlation insights, and root-cause analysis data, and by driving observability-based improvements to MTTD and MTTR.
- Provide documentation, observability standards, and runbooks while guiding internal teams on dashboards, alert tuning, and best-practice monitoring patterns.
- Work closely with Cloud, Infrastructure, Network, Application, and Security teams to ensure cohesive telemetry coverage and continuous improvement of monitoring across the enterprise.
- Provide on-call support during off-duty hours on weekdays, weekends, and holidays on a rotating basis.
Our Current Technical Environment:
- Tools: Elastic Stack (Elasticsearch, Kibana, APM, Logstash, Beats / Elastic Agents), ServiceNow, Azure Monitor, API integrations.
- Platforms: Azure cloud services, VMware, Linux/Windows servers, enterprise networking, Kubernetes/Containers.
- SaaS: Workday, Salesforce, ServiceNow, Jira, Confluence, Microsoft 365 (Teams, Exchange, SharePoint).
- Programming & Scripting: PowerShell, Python, Bash, API integrations.
- Cloud Architecture: Azure (ARM Templates, Terraform), container environments, hybrid cloud.
- Monitoring Concepts: Logs, metrics, traces, synthetics, anomaly detection, machine-learning–based alerting, RCA dashboards, ILM policies.
What You’ll Do:
- Receive work assignments from the ticketing system or leadership and execute observability engineering requests.
- Architect, deploy, and manage Elastic-based observability solutions across hybrid cloud and on-prem environments.
- Configure dashboards, visualizations, alert rules, anomaly detection jobs, and ML-based detections within Elastic.
- Optimize ingestion pipelines, index lifecycle management (ILM), retention policies, and search/query performance.
- Integrate Elastic Observability with Azure, VMware, network devices, Microsoft 365, ServiceNow, and API-based data sources.
- Implement instrumentation for logs, metrics, and traces across infrastructure, cloud workloads, network systems, and containers.
- Build synthetic monitoring checks, baseline performance metrics, and user-experience monitoring for critical SaaS applications.
- Design and refine alerting strategies to reduce false positives and improve detection precision.
- Support incident response by providing real-time monitoring views, RCA data, correlation insights, and post-incident analytics.
- Maintain dashboards, documentation, and runbooks for internal teams.
- Train teams on Elastic usage, dashboard interpretation, and alert tuning for operational excellence.
- Collaborate with cross-functional teams (Cloud, Infrastructure, Enterprise Apps, Security) to define and enforce monitoring standards.
- Participate in major initiatives to embed observability controls into design, deployment, and operations.
- Ensure observability systems meet enterprise requirements for performance, security, scalability, and compliance.
- Provide on-call support for critical incidents on a scheduled rotation.
What We Are Looking For:
- Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent work experience.
- 5–8 years of experience in observability engineering, SRE, monitoring, or IT operations.
- Proven hands-on experience with the Elastic Stack (Elasticsearch, Kibana, Beats/Elastic Agent, Logstash, APM).
- Experience monitoring Azure and/or AWS cloud services.
- Strong knowledge of instrumentation for logs, metrics, and traces across hybrid environments.
- Experience monitoring data center systems, networks, VMware, Microsoft 365, and Linux/Windows workloads.
- Hands-on experience configuring monitoring for SaaS apps such as Workday, Salesforce, Jira, Confluence, and ServiceNow.
- Strong scripting skills in PowerShell, Python, Bash, or similar.
- Experience with API integrations, automation, data ingestion pipelines, and monitoring agents.
- Ability to work under pressure and meet deadlines in a fast-paced environment.
- Ability to act independently, prioritize effectively, and drive observability improvements.
- Excellent analytical and troubleshooting skills with a focus on reliability and continuous improvement.
- Strong communication skills and ability to collaborate with engineers, architects, and stakeholders.
- Understanding of monitoring governance, logging standards, and observability best practices.
- Familiarity with Grafana, Prometheus, Splunk, AppDynamics, or Dynatrace (preferred).
- Knowledge of Terraform, Ansible, or Infrastructure-as-Code (preferred).
- Understanding of ITIL processes and incident/problem management methodologies.
- Experience working with hybrid cloud technologies, containers, and automation pipelines.
Our Values
If you want to know the heart of a company, take a look at its values. Ours unite us. They are what drive our success – and the success of our customers. Does your heart beat like ours? Find out here: Core Values
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.