AI Systems Architect - LLM & Vector Infrastructure

We are seeking a senior AI Systems Architect to design and implement AI-native application cores where Large Language Models (LLMs), vector databases, retrieval systems, and agent frameworks form the primary computational layer of our web and mobile applications.

This role is responsible for architecting scalable AI pipelines, retrieval-augmented generation (RAG) systems, memory architectures, AI agents, and orchestration workflows integrated with our development stack (Web, Mobile, n8n automation, and AI services).

The ideal candidate understands that AI is not a feature; it is the operating system of the product.

Key Responsibilities

1. AI Core Architecture Design

Design AI-first system architecture for web and mobile applications
Architect RAG pipelines using vector databases
Define long-term memory, short-term memory, and contextual state systems
Implement multi-agent AI systems
Design AI orchestration layers

2. Vector Database & Embedding Systems

Select and implement vector databases such as:
- Pinecone
- Weaviate
- Qdrant
- Milvus
- Supabase (pgvector)
Optimize embedding strategies
Implement hybrid search (semantic + keyword)
Design scalable indexing pipelines

3. LLM Integration & Optimization

Work with models and APIs from providers such as:
- OpenAI
- Anthropic
- Meta (Llama)
- DeepSeek
- Alibaba (Qwen)
Implement structured output pipelines
Design evaluation and prompt testing frameworks
Optimize cost-performance ratio

4. AI Agent Systems & Orchestration

Build autonomous AI agents
Design tool-calling systems
Integrate with:
- n8n
- LangGraph / LangChain-style agent flows
Implement memory-aware agents

5. Production AI Engineering

Build monitoring systems for hallucination detection
Design guardrails and validation layers
Implement evaluation datasets and benchmarking
Ensure security of AI pipelines
Build scalable infrastructure (Docker, Kubernetes, GPU optimization)

Requirements

Technical Expertise

5+ years of software engineering experience
2+ years of experience building production AI systems
Deep knowledge of:
- Vector embeddings & similarity search
- RAG architectures
- Tokenization and context window optimization
- Fine-tuning & LoRA concepts
- Prompt evaluation frameworks
Experience with Python (mandatory)
Experience with FastAPI / backend services
Experience designing scalable APIs

Architecture Experience

Designing distributed systems
Microservices & event-driven architecture
Experience with PostgreSQL + pgvector
Experience deploying LLM systems in production
