About Us
Trailhead Biosystems merges developmental biology with computerized experimental design to develop novel iPSC research products. Trailhead Biosystems is the pioneer of High-Dimensional Design-of-Experiment (HD-DoE®), a powerful platform that differentiates human induced pluripotent stem cells (iPSCs) into virtually any desired cell type with unparalleled precision and efficiency. We are moving beyond traditional trial-and-error methods by implementing Quality by Design (QbD) principles and utilizing DoE combined with Multivariate Data Analysis (MVDA).
We are seeking a forward-thinking AI Engineer to join our R&D team and help build the intelligent systems that drive our cellular engineering workflows.
Why Join Us?
As an AI Engineer at Trailhead, you will not just be managing data; you will be part of a team that is revolutionizing how we explore cell biology, using intelligent systems to direct human iPSC differentiation with unprecedented reproducibility and accuracy.
- Generate Computer-Based Experimental Designs: Drive advanced cell culture workflows through AI-supported design, drastically reducing the time and number of experiments needed for optimization.
- Model Biological Complexity: Mathematically model the effector/response space to perform virtual experiments, allowing us to predict how specific additives will create desirable cell culture fates before physical testing.
- Decipher Gene Expression: Gain deep insight into how culture conditions affect gene expression, addressing complex optimization challenges in cell purity and potency.
- Master Manufacturing Variance: Use data-driven insights to understand and control variance in cell manufacturing methods, ensuring consistent, high-quality outcomes.
The Role
As an AI Engineer, you will be instrumental in deploying Agentic AI frameworks that can reason, plan, and execute complex scientific tasks. You will build autonomous multi-agent systems capable of analyzing proprietary biological data, managing knowledge graphs, and optimizing experiments. Working within a robust Azure cloud architecture, you will turn research concepts into production-grade tools that empower our scientists to work faster and smarter.
Responsibilities
- Cross-Functional Collaboration: Partner closely with research scientists, biologists, and product managers to translate complex scientific challenges into actionable technical requirements and scalable AI solutions.
- Develop Agentic AI Solutions: Design and build autonomous AI agents that leverage our proprietary biological datasets to automate complex analytical workflows, from experimental design to data interpretation.
- Fine-Tune Domain-Specific Models: Fine-tune open-weights models (e.g., Llama, Mistral) on our proprietary biological datasets to create specialized models capable of understanding cell differentiation pathways and effector-response relationships.
- Proprietary Data Architecture & UI: Architect, build, and maintain a secure proprietary database and automated data ingestion pipelines (ETL/ELT), and develop intuitive interfaces and APIs that let scientists easily query, visualize, and interact with our proprietary data.
- Build & Maintain Knowledge Graphs: Integrate GenAI models with graph databases (e.g., Neo4j) to map complex relationships between cell lines, reagents, and genetic markers, enabling the AI to perform deep reasoning over structured biological data.
- Orchestrate Multi-Agent Systems: Design and implement sophisticated multi-model, multi-agent architectures using state-of-the-art frameworks such as LangChain, LangGraph, or CrewAI. Your systems will coordinate multiple specialized agents to solve multi-step problems collaboratively.
- Production Deployment on Azure: Lead the deployment of AI solutions into a secure Azure cloud environment. You will be responsible for setting up monitoring, ensuring observability (logging, tracing), and managing the lifecycle of models using MLOps best practices (e.g., Azure Machine Learning, MLflow).
- RAG & Information Retrieval: Implement Retrieval-Augmented Generation (RAG) pipelines to ground Large Language Models (LLMs) in our internal scientific corpus, ensuring high accuracy and reduced hallucinations in technical outputs.
- Evaluation & Benchmarking: Design, implement, and own automated LLM evaluation frameworks and benchmarking suites to quantify model accuracy, reliability, and business impact.
- Guardrails, Quality Assurance & Policy: Establish and maintain robust guardrails, observability, and quality gates to mitigate hallucinations, ensure AI safety, and prevent jailbreaking. Help develop AI policies, governance, and procedures.
- Documentation: Create and maintain clear technical documentation, architecture diagrams, and runbooks to ensure system reproducibility and facilitate team knowledge sharing.
- Ownership: Own end-to-end model performance and reliability across different systems, including continuous monitoring, logging, and regression testing.
Qualifications & Requirements
Education:
- Master’s degree in Computer Science, Artificial Intelligence, Computational Biology, or a related quantitative field.
Experience:
- 3-5 years of professional experience in software engineering with a dedicated focus on AI/ML.
- 2+ years of experience developing agentic AI systems and RAG pipelines using large language models, bringing prototypes through to scalable production.
Technical Skills:
- Agentic Frameworks: Strong proficiency with orchestration frameworks like LangChain, LangGraph, CrewAI, Microsoft Semantic Kernel, or AutoGen. Experience building agents that utilize tool calling and function execution is essential.
- Programming: Expert-level fluency in Python and familiarity with modern software development practices (CI/CD, Git, containerization with Docker/Kubernetes).
- Cloud Architecture: Demonstrated experience deploying scalable applications on Microsoft Azure. Familiarity with Azure OpenAI Service, Azure AI Search, and Azure Functions is highly preferred.
- Data Engineering & Knowledge Graphs: Experience designing schemas and querying graph databases (e.g., Neo4j, RDF/SPARQL) and working with vector databases (e.g., Pinecone, Weaviate, Milvus).
- MLOps: Practical knowledge of model monitoring, evaluation frameworks (e.g., Ragas, LangSmith), and deploying models via REST APIs (FastAPI).
Soft Skills:
- Ability to translate complex technical concepts for cross-functional teams, including biologists and lab scientists.
- Strong problem-solving skills with a "builder" mindset: you enjoy taking projects from prototype to production.
- Collaborative spirit with a desire to work in a fast-paced, research-driven environment.
Nice to Have:
- Experience in biotechnology, pharmaceutical, or healthcare sectors.
- Familiarity with biological data standards or bioinformatics workflows.
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Health insurance
- Life insurance
- Paid time off
- Parental leave
- Vision insurance
Work Location: In person