Agentic AI Developer — Claude & Anthropic Stack

Location: Lahore

Experience: 3-4 years

Open Positions: 1

Role Summary

We are hiring an Agentic AI Developer with deep, hands-on expertise in the Anthropic Claude ecosystem to design, build, and ship production AI agents and deploy automations across every function of the company — Sales, Marketing, Delivery, HR, Finance, Admin, and Legal. This is a builder role. You will move from problem to working agent to organization-wide rollout, owning the full lifecycle: discovery, architecture, prompt and tool design, evaluation, deployment, monitoring, and change management.

We are specifically standardizing on the Anthropic stack: Claude API, Claude Agent SDK, Claude Code, Claude Skills, MCP, and Cowork. We need someone who lives in this stack daily and knows it cold, not someone whose Claude experience is one chapter inside a multi-model background.

Non-Negotiable Requirements

Applications that do not clearly demonstrate all four of the following will be rejected at screening:

10+ production agents shipped. You have built and shipped at least 10 end-to-end agents in production (not prototypes, not demos, not coursework). You must be able to walk through each one: the problem, the architecture, the tools, the eval approach, the failure modes you handled, and the business outcome.
Claude-native expertise. Deep, daily, hands-on Anthropic Claude experience. You should be able to discuss Opus vs Sonnet vs Haiku trade-offs, prompt caching mechanics, tool use loop design, sub-agents, computer use, extended thinking, and Claude-specific prompting patterns without hesitation.
MCP fluency. Production experience with Model Context Protocol (MCP) — you have built MCP servers, integrated MCP clients, and understand the protocol well enough to debug it at the JSON-RPC level.
Cowork experience. Demonstrated experience using Cowork (Anthropic's desktop automation tool for non-developer file and task workflows) or comparable Anthropic-native deployment surfaces. You can articulate where Cowork is the right answer versus a coded agent versus Claude Code.

What You Will Own

Agent design and development. Architect single-agent and multi-agent systems on Claude. Define roles, tool interfaces, memory, planning loops, sub-agent handoffs, and human-in-the-loop checkpoints.
MCP tool and integration layer. Build and maintain MCP servers that expose internal systems (CRM, HRIS, ticketing, document stores, databases, email, calendar) to Claude agents in a secure, auditable way.
Organization-wide automation rollout. Identify high-ROI automation candidates across departments, ship pilots within weeks, and scale validated agents into daily operations using the right Anthropic surface — API, Agent SDK, Claude Code, Skills, or Cowork.
Evaluation and reliability. Define golden datasets, eval harnesses, regression tests, and guardrails. Track agent accuracy, task completion rate, latency, and token cost per task.
Productionization. Deploy with proper logging, observability, error handling, retries, secrets management, prompt caching, and batching. Own uptime and incident response.
Adoption and enablement. Run discovery workshops with department heads, document SOPs, train users, and drive adoption metrics. You are responsible for usage, not just delivery.
Cost and governance. Manage token spend, model selection between Opus/Sonnet/Haiku, prompt caching strategy, data privacy boundaries, and compliance with internal AI usage policy.

Engineering Foundations

3+ years of professional software engineering, with at least 18 months on production LLM applications or agents.
Strong Python (primary) for agent code and MCP servers. TypeScript/Node for tool servers and integration glue.
Comfortable with FastAPI, async patterns, retries, circuit breakers, and structured logging.
At least one cloud (AWS, GCP, or Azure), containerization (Docker), and a workflow tool (Temporal, Prefect, n8n, or equivalent).
Git, code review, CI/CD discipline. You write tests for agent code, including eval-as-test patterns.
RAG production experience: vector databases (pgvector, Pinecone, Qdrant, or Weaviate), chunking strategy, hybrid search, reranking, and grounding evaluation.

Portfolio Expectations — The 10 Agents

You must submit a portfolio document covering at least 10 production agents you personally built and shipped. For each agent, include:

Problem statement and the department/user it served.
Architecture (single-agent, orchestrator-worker, sub-agents, etc.) and why.
Anthropic surface used (API, Agent SDK, Claude Code, Skills, Cowork, MCP servers).
Tools and integrations wired up.
Eval approach — what you measured, on what dataset, and the result.
Failure modes you debugged and how you fixed them.
Production outcome: hours saved, cycle time reduced, error rate cut, revenue impact, or adoption metric.

Demos, repo links, architecture diagrams, or recorded walkthroughs are all welcome. Generic CVs without this portfolio will not be reviewed.

Work Location: In person
