About The Job
Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, Mercor is backed by investors including Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.
Position: AI Safety Experts (English & Punjabi)
Type: Contract
Compensation: $20–$22/hour
Location: Remote
Role Responsibilities
- Red team conversational AI models and agents. Conduct jailbreaks, prompt injections, misuse cases, and bias exploitation.
- Generate high-quality human data. Annotate failures, classify vulnerabilities, and flag systemic risks.
- Apply structure using taxonomies, benchmarks, and playbooks to maintain consistent testing.
- Document reproducibly. Produce reports, datasets, and attack cases for customer action.
Qualifications
Must-Have
- Fluent in English and Punjabi.
- Prior experience in red teaming (AI adversarial work, cybersecurity, socio-technical probing).
- Strong communication skills. Explain risks clearly to technical and non-technical stakeholders.
- Adaptable. Thrive on moving across projects and customers.
Preferred
- Experience in Adversarial ML: jailbreak datasets, prompt injection, RLHF/DPO attacks, model extraction.
- Cybersecurity skills: penetration testing, exploit development, reverse engineering.
- Socio-technical risk expertise: harassment/disinfo probing, abuse analysis, conversational AI testing.
- Creative probing abilities: psychology, acting, writing for unconventional adversarial thinking.
Application Process (Takes 20–30 mins to complete)
- Upload resume
- AI interview based on your resume
- Submit form
PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.