Qureos

Contract Data/ML Engineer-Scoring Reliability & Candidate Archetypes (Part-time)

Job: Contract Data/ML Engineer — Scoring Reliability & Candidate Archetypes - ASAP
Job Type: Part-time (for a 100-hour project)
Job Presence: Remote (optional for Onsite and Hybrid in Vietnam)
Candidate Location: Vietnam, India
Joining Date: ASAP as we're hiring this role urgently
Summary
Own the end-to-end implementation of two analytics features in Qode’s multi-agent assessment stack: (1) bootstrap confidence intervals (CIs) for per-question scores to communicate stability/disagreement across evaluators, and (2) candidate archetype discovery via clustering to surface talent patterns beyond raw scores. You’ll ship data plumbing, models, integrations, and lightweight reporting.
What you’ll do
  • Data foundations: ensure per-candidate, per-question, per-agent criterion scores are structured and queryable; add/modify tables and JSON schemas as needed.
  • Bootstrap CIs: implement agent-level resampling, compute CI-90/CI-95, derive stability labels (high/medium/low), and persist alongside normalized scores; batch backfill existing records.
  • Archetypes: build standardized candidate feature vectors (per-question and/or per-criterion), run clustering (K-means/GMM/hierarchical), evaluate (e.g., silhouette), and generate human-readable labels from centroids and summaries.
  • Integrations: expose CI fields and cluster IDs/labels via API and internal dashboards; add basic charts/UX to surface stability and “candidate type.”
  • Reliability & performance: write unit/integration tests, add guardrails (e.g., a minimum number of agents per score), and keep pipeline runtime within agreed budgets.
  • Docs & handoff: clear README/runbooks covering data contracts, thresholds, and ops.
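To give a sense of the bootstrap CI task above, here is a minimal sketch of agent-level resampling for a single candidate-question pair. It is an illustration only, not Qode's implementation: the function names `bootstrap_ci` and `stability_label`, the percentile method, the default of three minimum agents, and the width thresholds for the stability labels are all assumptions made for this example.

```python
import numpy as np


def bootstrap_ci(agent_scores, level=0.95, n_boot=2000, min_agents=3, seed=0):
    """Percentile bootstrap CI for the mean of one question's agent scores.

    Hypothetical helper: min_agents, n_boot and the percentile method
    are illustrative assumptions, not values taken from the posting.
    """
    scores = np.asarray(agent_scores, dtype=float)
    if scores.size < min_agents:
        return None  # guardrail: too few agents to say anything about stability
    rng = np.random.default_rng(seed)
    # Resample agents with replacement; each row is one bootstrap replicate.
    idx = rng.integers(0, scores.size, size=(n_boot, scores.size))
    boot_means = scores[idx].mean(axis=1)
    alpha = 1.0 - level
    ci_low, ci_high = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return {
        "mean": float(scores.mean()),
        "ci_low": float(ci_low),
        "ci_high": float(ci_high),
    }


def stability_label(ci_low, ci_high, score_range=1.0, high_max=0.10, medium_max=0.25):
    """Map CI width (as a fraction of the score range) to high/medium/low.

    The cut-offs are placeholders; real thresholds would be agreed and
    documented in the runbook.
    """
    width = (ci_high - ci_low) / score_range
    if width <= high_max:
        return "high"
    if width <= medium_max:
        return "medium"
    return "low"


# Example: five agents scored the same answer on a 0-1 scale.
result = bootstrap_ci([0.80, 0.82, 0.79, 0.81, 0.80])
label = stability_label(result["ci_low"], result["ci_high"])
```

Passing `level=0.90` gives the CI-90 variant. A fixed seed is used here so that a backfill would be reproducible.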
Must-have skills and qualifications
  • 3-5 years of experience in a relevant role
  • Python (pandas, NumPy, scikit-learn), SQL, DB migrations (e.g., Postgres).
  • Statistical resampling (bootstrap), clustering, model selection/validation.
  • Data engineering for batch jobs/backfills; API integration.
  • Pragmatic product sense for labeling clusters and communicating uncertainty.
Nice-to-haves
  • Airflow/dbt/Prefect; Grafana/Metabase; experience with multi-agent/LLM evaluation pipelines; cloud (GCP/AWS/Azure); Docker/Kubernetes.
Deliverables & acceptance criteria
  • CI service/module + persisted mean, ci_low, ci_high, stability_label for 100% of scored candidate-question rows with ≥N agents; reproducible backfill completed.
  • Clustering job that assigns cluster_id and cluster_label to each candidate; labels documented with centroid profiles and example candidates.
  • API fields and minimal dashboard tiles (score±CI, stability badge; “Candidate Type” with top strengths/weaknesses).
  • Tests (unit + E2E), monitoring hooks, and runbooks.
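As a rough illustration of the clustering deliverable, the sketch below standardizes candidate feature vectors, picks the number of K-means clusters by silhouette score, and derives a simple label from each centroid. It is a sketch under stated assumptions, not the expected design: `discover_archetypes`, the candidate range of cluster counts, and the "strongest/weakest" label wording are hypothetical, and GMM or hierarchical clustering could be swapped in.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def discover_archetypes(features, criteria, k_range=range(2, 6), seed=0):
    """Cluster candidates and label each cluster from its centroid.

    features: (n_candidates, n_criteria) array of scores.
    criteria: criterion names, one per feature column.
    Hypothetical helper; k_range and the label format are assumptions.
    """
    X = StandardScaler().fit_transform(np.asarray(features, dtype=float))
    best = None
    for k in k_range:
        if k >= len(X):
            break  # silhouette needs fewer clusters than samples
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        score = silhouette_score(X, model.labels_)
        if best is None or score > best[0]:
            best = (score, model)
    if best is None:
        raise ValueError("not enough candidates to cluster")
    score, model = best

    # Centroids are in standardized units, so "strongest" and "weakest"
    # mean relative to the cohort average, not absolute score.
    labels = {}
    for cluster_id, centroid in enumerate(model.cluster_centers_):
        top = criteria[int(np.argmax(centroid))]
        low = criteria[int(np.argmin(centroid))]
        labels[cluster_id] = f"strongest on {top}, weakest on {low}"
    return model.labels_.tolist(), labels, float(score)
```

The function returns one `cluster_id` per candidate, a `cluster_label` per cluster, and the silhouette score of the chosen model, which could be logged as a basic quality check before labels are published.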

© 2025 Qureos. All rights reserved.