About THIA
THIA is transforming how small and medium enterprises build internal applications and automate business processes. Our AI-powered platform enables business experts to create custom applications using natural language, eliminating the need for expensive development teams. We're well-funded, generating revenue, and solving real problems for companies that need more than off-the-shelf software.
The Role
This is a hands-on ML role where you'll own modeling work end to end: fine-tuning the language models that power our platform, building the eval frameworks that tell us whether they're getting better, and shipping the result into production. You'll work closely with a small, senior team and have direct influence over what we train, how we measure quality, and how the model evolves alongside the product.
We move fast, keep our codebase clean, and take tech debt seriously.
What You'll Do
Modeling & Evaluation
- Fine-tune transformer-based models (instruction tuning, LoRA/PEFT, RLHF/DPO, distillation) and take the results through evaluation into production
- Design and curate evaluation datasets that meaningfully reflect real customer behavior
- Build LLM-as-judge pipelines and align them against human judgment
- Run experiments end to end: hypothesis → controlled comparison → calibrated metrics → decision
- Own model-quality monitoring in production and close the loop back into training data
Collaboration
- Work autonomously while staying tightly coordinated with a small, async-first team
- Partner with the platform team on model serving, observability, and the multi-tenant rollout
- Contribute to architectural decisions and internal documentation
What We're Looking For
Must-Haves
- Strong Python; working expertise with PyTorch and the HuggingFace transformers ecosystem
- Hands-on fine-tuning experience on transformer-based models, with at least one shipped or rigorously evaluated result
- Experimental rigor: hypothesis design, controlled comparisons, calibrated metrics
- Experience carrying at least one ML feature through the full lifecycle (data → train → eval → deploy → monitor)
- Cloud ML lifecycle experience (GCP/AWS/Azure)
Strongly Preferred
- LLM-as-judge eval pipelines and human-judgment alignment
- RAG or retrieval system design experience
How We Evaluate
We hire for skill and potential, however acquired. If you can do the work, we want to hear from you.
A Note on AI
We actively encourage using AI tools to move faster. Real-world experience is still required to direct AI effectively, catch what it misses, and spot security issues before they reach production.
Our Stack
Python · PyTorch · HuggingFace · Modal · GCP · PostgreSQL / SQLite · Qdrant · Redis · GitLab CI/CD · Datadog
What You Gain
- Ownership - end-to-end accountability for the models that power a growing AI company
- Impact - direct collaboration with leadership and real influence on technical direction
- Growth - clear path to a lead role as the team expands
- Equity - early-stage equity at an AI startup
- Flexibility - fully remote with flexible hours
- Quality - a clean codebase and a team that takes tech debt seriously
Pay: $110,000 - $160,000 per year
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Health insurance
- Paid time off
- Vision insurance
Work Location: Remote