About Polymath
Polymath is an applied research lab focused on advancing long-horizon agent capabilities through reinforcement learning. We design and scale simulation environments where agents learn to operate safely and autonomously. We work with the world’s leading model labs to push the frontier of agent capabilities. Polymath is backed by Base10, Founders Future, Y Combinator, and other incredible investors & angels. We've raised an $8M seed round and are actively growing the team.
About the role
We’re hiring Software Engineers to build the simulation environments, tasks, and verifiers that challenge frontier models. You’ll help create the training and evaluation grounds that make it possible to measure and improve autonomous agents on realistic, challenging work. This is a contract-based role with the opportunity to transition into a full-time position.
Examples of projects you could work on include:
Building diverse, high-fidelity environments that test agents in realistic settings
Designing complex tasks that require long-horizon reasoning, tool use, and adaptation
Developing robust verifiers that reliably measure agent performance
Improving the quality, difficulty, and realism of our evaluation environments
Improving infrastructure and tooling to run, debug, and improve Polymath’s environment simulation platform
Working closely with the research team to identify failure modes and turn them into new tasks and benchmarks
You may be a good fit if you:
Have strong engineering fundamentals
Enjoy building from first principles and solving open-ended technical problems
Have high agency and a strong bias toward shipping
Have a high quality bar and care about building robust systems
Culture:
Polymath is a team of researchers, engineers, and operators focused on advancing the frontier of safe, superintelligent AI agents.
We have a flat organizational structure. We believe that people do their best work when they’re self-motivated and driven by a desire to learn, contribute to the team’s goals, and advance scientific progress.
We’re looking for folks who ship fast, set high standards for themselves, and are great team players.