Senior AI R&D Engineer, Agent Orchestration
Role Summary
We are seeking a highly skilled Senior AI R&D Engineer to focus on the cutting-edge field of agent orchestration. This role is responsible for designing, implementing, and deploying sophisticated systems that enable AI agents to execute complex, multi-step tasks autonomously. A deep, practical understanding of Large Language Models (LLMs) and the ability to rigorously evaluate agent system performance are non-negotiable.
What You Will Do
Agent Orchestration Development: Design and build advanced orchestration layers, including planning, memory management, tool-use logic, and iterative reasoning loops for autonomous agents.
Production Deployment: Lead the deployment of LLM-powered features and agent systems into production environments, ensuring scalability, latency targets, and reliability.
Mandatory Evaluation Mastery: Own the definition, implementation, and analysis of agent and LLM evaluation benchmarks (reference tests, heuristics, model-graded evaluation). This includes metrics for helpfulness, accuracy, hallucination, and safety.
Framework Implementation: Leverage and contribute to modern agentic frameworks (e.g., LangChain, AutoGen, CrewAI, custom internal frameworks) to build out capabilities.
Research to Production: Translate the latest advancements in LLM and agent research into robust, production-ready features, collaborating closely with research scientists.
A/B Testing: Design and execute rigorous A/B tests on prompts, system settings, and model versions, making data-driven recommendations for rollouts or rollbacks.
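For illustration only, the reference tests and heuristics named under evaluation above might look something like the minimal sketch below. Every name in it (EvalCase, reference_match, heuristic_ok, evaluate, stub_agent) and the sample data are hypothetical and not taken from this posting or from any particular framework; model-graded evaluation is left out because it would require a live model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One evaluation example (hypothetical structure)."""
    prompt: str
    reference: str  # expected answer used by the reference test


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting differences pass.
    return " ".join(text.lower().split())


def reference_match(output: str, reference: str) -> bool:
    """Reference test: normalized exact match against the expected answer."""
    return _normalize(output) == _normalize(reference)


def heuristic_ok(output: str, max_words: int = 50) -> bool:
    """Heuristic check: the output is non-empty and within a word budget."""
    word_count = len(output.split())
    return 0 < word_count <= max_words


def evaluate(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, float]:
    """Run the agent on every case and report pass rates for both checks."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    reference_passes = 0
    heuristic_passes = 0
    for case in cases:
        output = agent(case.prompt)
        reference_passes += reference_match(output, case.reference)
        heuristic_passes += heuristic_ok(output)
    return {
        "reference_accuracy": reference_passes / len(cases),
        "heuristic_pass_rate": heuristic_passes / len(cases),
    }


def stub_agent(prompt: str) -> str:
    """Stand-in for a real agent: canned answers, empty string when unknown."""
    canned = {"2+2?": "4", "Capital of France?": "paris "}
    return canned.get(prompt, "")


cases = [
    EvalCase("2+2?", "4"),
    EvalCase("Capital of France?", "Paris"),
    EvalCase("Largest planet?", "Jupiter"),
]
report = evaluate(stub_agent, cases)
print(report)
```

In this sketch the stub answers two of the three prompts, so both pass rates come out to two thirds. A production harness would typically add per-case logging and metrics for hallucination and safety alongside these checks.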
Key Requirements
10+ years of professional experience in machine learning engineering, NLP, or AI R&D.
Mastery of LLM/Agent Evaluation: Demonstrated expertise in setting up, running, and analyzing evaluation systems for LLMs and agentic workflows.
Production Experience (Mandatory): Extensive, hands-on experience putting LLMs and at least one major agentic framework (or equivalent custom architecture) into high-volume production.
Strong software engineering foundation, primarily in Python, with experience writing clean, maintainable, and efficient code.
Experience with the advanced features of Python and its ML/LLM libraries: PyTorch, LangChain, LlamaIndex, DeepEval, and scikit-learn.
Deep understanding of LLM mechanisms (prompt engineering, fine-tuning, RAG, function/tool calling).
Experience with observability tools (logging, tracing, monitoring) for real-time model performance diagnostics.
Bonus Points
Background in academic research (publications) related to AI, NLP, or reinforcement learning.
Experience optimizing inference pipelines for performance and cost.
Familiarity with safety and guardrail systems for generative AI.
© 2025 Qureos. All rights reserved.