Remote
Contract
You will review AI-generated Hebrew and English responses and/or generate high-quality bilingual training content. The work involves evaluating reasoning quality and step-by-step problem-solving, and providing expert feedback that helps models produce answers that are accurate, logical, and clearly explained.

You will assess solutions for accuracy, clarity, and adherence to the prompt; identify errors in methodology or conceptual understanding; fact-check information; write high-quality explanations and model solutions that demonstrate correct methods; and rate and compare multiple AI responses based on correctness and reasoning quality.

This is an hourly paid, fully remote contractor role with flexible work-from-anywhere hours. This job is with a fast-growing AI Data Services company that provides training data for many of the world’s largest AI companies and foundation model labs. Your work directly helps improve the world’s premier AI models.
- Bachelor’s degree (or higher) in Linguistics, Translation, Hebrew Language, Communications, Journalism, or a related field.
- Native or near-native Hebrew proficiency with strong writing and editing skills across formal and informal registers.
- Minimum C1 English proficiency (reading and writing) for bilingual evaluation and instruction adherence.
- 3+ years of professional experience in translation, localization, editorial QA, content quality, or linguistic review (or equivalent).
- Excellent attention to detail and ability to apply detailed rubrics consistently in high-volume, hourly contractor work.
- Strong fact-checking habits and comfort validating claims using reputable sources when required by task guidelines.
- Ability to identify reasoning gaps, methodological errors, and unclear explanations, even when the language is fluent.
- Comfort working independently in a remote, asynchronous environment with reliable availability and communication.
- Prior experience with AI data training, annotation, or evaluation workflows is strongly preferred.
- Familiarity with Israeli cultural context and terminology norms across common domains (news, education, consumer, tech) is preferred.
- Develop AI Training Content: Create detailed prompts and responses on a variety of topics to guide AI learning, ensuring the models reflect a comprehensive understanding of diverse subjects (Hebrew/English; requires C1+ English).
- Optimize AI Performance: Evaluate and rank AI responses to enhance the model's accuracy, fluency, and contextual relevance (Hebrew/English; requires C1+ English).
- Ensure Model Integrity: Test AI models for potential inaccuracies or biases, validating their reliability across use cases (Hebrew/English; requires C1+ English).