AI Observability Expert (Tableau, Grafana, Prometheus, Thanos)
We are seeking an AI Observability professional to design, implement, and scale production-grade observability for ML/LLM applications, covering model performance, data quality and drift, safety compliance, cost, latency, reliability (SLOs), and user experience. You will evaluate, integrate, or build observability tooling (metrics, logs, and traces, plus model telemetry). This role partners closely with ML engineers and platform teams to enable trustworthy AI systems in production. You will build telemetry for models, including latency, token usage, throughput, error rates, and SLOs for AI endpoints, and create self-service dashboards.
Candidates should be strong in data analysis and visualization, with experience creating Grafana and telemetry dashboards. They should have experience with the Tableau, Grafana, Prometheus, and Thanos stack, and knowledge of SQL, time-series data, PromQL, Linux, and different types of visualization graphs and charts.
Salary range: $100,000 to $120,000 a year