Who We Are
Konecta is a global leader in customer management and digital outsourcing, with 120,000 employees across 26 countries. Headquartered in Madrid, we deliver end-to-end customer experience solutions for 500+ clients worldwide.
About the Role:
Our GenAI platform requires comprehensive observability to ensure production reliability, performance optimization, and cost management.
As our Observability Engineer, you will design and implement monitoring, alerting, and dashboarding infrastructure that provides deep visibility into platform health, API performance, AI workloads, and operational costs.
Key Responsibilities:
- Design and implement observability architecture using Prometheus and Grafana
- Deploy and manage Prometheus stack on GKE with HA and retention strategy
- Build Grafana dashboards for platform health, APIs, and AI use cases
- Implement custom metrics for CrewAI agents, Kong, and LLM usage
- Configure OpenTelemetry across services
- Design alerting rules (P0–P3 severity levels) and on-call workflows
- Build cost dashboards for LLM token usage and infrastructure spend
- Integrate with GCP Cloud Monitoring & Logging
- Establish SLI/SLO frameworks aligned with SRE principles
- Create runbooks for incident response
Required Skills:
- 4+ years in observability / monitoring engineering
- Strong expertise in Prometheus (PromQL, alerting, recording rules)
- Advanced Grafana dashboarding and alerting
- Experience with OpenTelemetry for tracing and metrics
- Kubernetes monitoring (kube-state-metrics)
- Solid understanding of SRE principles (SLIs, SLOs, error budgets)
- Log aggregation tools (Loki, ELK, or similar)
- Experience designing scalable alerting frameworks
Nice to Have:
- Experience with Google Cloud Platform Cloud Monitoring & Cloud Trace
- AI/ML observability (model latency, token usage, drift detection)
- API gateway monitoring (Kong, Envoy, etc.)
- Long-term Prometheus storage strategies
- FinOps & cost observability dashboards