
Data Scientist – AI Training & Evaluation

About The Role

AI is only as good as the experts who train it. We're looking for data scientists to help evaluate, refine, and improve next-generation AI systems — bringing your quantitative expertise directly to bear on how the world's most advanced models reason, analyze, and communicate.

This is a fully remote, flexible contract role. You set your hours and work at your own pace, contributing to projects that sit at the frontier of applied AI research.

Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 10–40 hours/week

What You'll Do

Evaluate AI model outputs for statistical soundness, reasoning quality, and analytical accuracy
Design and apply data-driven evaluation criteria and scoring rubrics
Analyze patterns in AI-generated responses to surface systematic errors or biases
Create high-quality training data — including prompts, worked solutions, and expert annotations — across data science and ML domains
Review AI-generated code, visualizations, and statistical analyses for correctness and best practices
Provide structured, detailed feedback that directly improves model performance
Work independently and asynchronously on your own schedule

Who You Are

Degree in Data Science, Statistics, Computer Science, Mathematics, or a related quantitative field (MS or PhD preferred)
Strong foundation in statistics, probability, and machine learning concepts
Proficient in Python, R, SQL, or similar data analysis tools
Experienced with data wrangling, exploratory data analysis, and model evaluation
Sharp analytical thinker with excellent attention to detail
Clear written communicator — able to explain complex technical concepts concisely
Self-motivated and comfortable working independently in an async environment

Nice to Have

Experience with deep learning frameworks such as PyTorch or TensorFlow
Familiarity with NLP, large language models, or AI evaluation workflows
Published research or hands-on industry experience in applied machine learning
Background in A/B testing, causal inference, or experimental design

Why Join Us

Work on cutting-edge AI projects alongside top research labs and AI teams globally
Get rare, inside exposure to how state-of-the-art LLMs are trained and evaluated
Fully remote and async — work when and where it suits you
Complete autonomy over your schedule and workload (10–40 hrs/week)
Join a growing community of expert contributors who are actively shaping the future of AI
Potential for ongoing work and long-term contract extension
