Find The Right Job.

PhD Rater - Remote

Seeking experienced researchers and technical experts to support a frontier-model evaluation project focused on agentic workflows. You will design and validate challenging benchmark tasks in data science, machine learning, finance, and coding to help identify reasoning and problem-solving gaps in advanced STEM models. The role involves building real-world tasks with executable tests and analyzing model or agent behavior.

Key Responsibilities

- Design challenging, real-world STEM problems.
- Implement each task within an agentic development environment using Python.

Contract and Payment Terms

- You will be engaged as an independent contractor.
- This is a fully remote role that can be completed on your own schedule.
- Projects can be extended, shortened, or concluded early depending on needs and performance.
- Payments are made weekly via Stripe or Wise, based on services rendered.

Similar jobs

Alibaba Cloud Intelligence-Business Development Manager (Hybrid Cloud)-Saudi Arabia

Alibaba Cloud

Riyadh, Saudi Arabia

about 3 hours ago

UX Researcher

Blue Book Global

Saudi Arabia

10 days ago

Senior HPC AI/ML Support Specialist

KAUST (King Abdullah University of Science and Technology)

Saudi Arabia

10 days ago

Senior Claims Consultant

Marsh McLennan

Riyadh, Saudi Arabia

10 days ago

Research Officer

مجموعة الموسى الصحية

Al Khobar, Saudi Arabia

10 days ago

IT Digital Analyst

Sadara Chemical Company

Al Jubayl, Saudi Arabia

11 days ago

AI Consultant

Devoteam

Riyadh, Saudi Arabia

11 days ago
