Product Manager: Data and Evaluation

About AI71:

We build AI products and deliver custom AI solutions for corporations and government entities that are sensitive to privacy and data sovereignty
We're a product company with a strong advisory/forward-deployed team
Our flagship product is Ask, a customizable enterprise AI platform for knowledge-worker productivity. We're also building vertical AI products for construction, health, agriculture, and other sectors
We're a young (2023) but fast-growing (140 people) company
Our team comes from tier 1 product companies (DeepMind, Google, Amazon, Apple) and tier 1 strategy consulting firms (McKinsey QuantumBlack, BCG X)
We're based in Abu Dhabi and have strong ties to the region, but we serve international customers

The Role: Squad PM - Data & ML Evaluation

You'll be the product owner for product data and ML evaluation for Ask. You'll define the data strategy, instrumentation, experimentation, and model evaluation loops that drive product decisions and model quality. You'll lead a data and eval squad and partner closely with PMs, data scientists, ML engineers, and platform teams to ensure we ship measurable, safe, and delightful AI features. You'll also help ground our pricing strategy in data.

What You'll Do

Own the product data stack end-to-end across squads: telemetry, event schemas, tracking plans, data quality, warehousing, and accessibility of metrics for both our on-premises and SaaS offerings.
Define the product metrics framework (north stars, input metrics, leading indicators) and help stakeholders build reliable dashboards.
Build and own ML evaluation loops: curate test sets; define KPIs (quality, safety, bias, latency, cost); run offline and online evals; help evaluate agents, tools, fine-tunes, and retrieval changes; and maintain an eval pipeline that is reproducible and automated in CI.
Lead experimentation: design and analyze A/B tests, and help create a framework for testing our customers' agents and our "hyperparameters".
Ground pricing strategy in data: analyze feature/cost relationships, develop value-based pricing models, and create metrics that link usage patterns to pricing tiers.
Set data standards: event naming, governance, PII handling, retention, and privacy-by-design, in collaboration with Security/Legal teams.

What You've Done (Requirements)

5–8+ years in product roles with a strong engineering or data background (e.g., software, data engineering, ML engineering, analytics engineering) in tech product companies
Proven ownership of instrumentation, analytics, and experimentation for a product surface or platform
Deep comfort with SQL and experimentation stats (power, MDE, CUPED, guardrail metrics); able to sanity-check analyses and run your own when needed
Practical experience working with ML-powered products and their evaluation (offline test sets, online metrics, drift/monitoring)
Strong product sense + stakeholder management; able to drive clarity across PM, DS, Eng, and Execs
Comfortable using AI tools to prototype analyses or small utilities (e.g., generate code for ETL, dashboards, or validators)

Nice-to-Haves

Hands-on with Amplitude (or similar product analytics) and MLflow (or similar experiment tracking)
Experience with RAG/LLM evals, safety/guardrail metrics, and red-teaming
Familiarity with data modeling (e.g., dbt), feature stores, and observability (e.g., Monte Carlo, Great Expectations)
Experience in privacy-sensitive / regulated environments

How You'll Work

Embedded in a "Data and Eval" squad within Ask, collaborating with the rest of the Ask team (~50 engineers, ~4 PMs, 1 PMM, and ~3 Product Designers)
Close partnership with customers, Security/Legal (for data governance), and the Strategy and Finance teams
