We are seeking a highly skilled Senior DevOps Engineer with experience in architecting and deploying agentic AI systems. The ideal candidate will have a strong background in DevOps, cloud infrastructure, and modern AI integrations, including LLM-based systems and autonomous agent development.
Responsibilities
- Architect AI-Driven Systems: Design and implement modular architectures that integrate large language models (LLMs) and support multi-step reasoning, retrieval-augmented generation (RAG), and external tool usage.
- AI Model Integration: Evaluate, select, and integrate suitable AI models (e.g., GPT-5, custom LLMs) into scalable production environments.
- Agent Development: Build and deploy intelligent agents capable of research, summarization, task automation, and action execution.
- Infrastructure & Scalability: Manage containerized environments using Docker and Kubernetes, ensuring high availability, performance, and security.
- Responsible AI Practices: Establish and maintain content filtering, audit logging, and data protection mechanisms to ensure ethical AI operations.
- Stakeholder Collaboration: Work closely with cross-functional teams to translate business needs into scalable technical AI solutions.
- Continuous Optimization: Monitor agent performance, analyze errors, refine prompts, and optimize system behavior for accuracy and efficiency.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in DevOps or cloud infrastructure engineering.
- Strong hands-on experience with Docker, Kubernetes, and CI/CD pipelines.
- Experience with LLMs, AI model deployment, or MLOps workflows is a strong plus.
- Knowledge of cloud platforms (AWS, Azure, or GCP) and infrastructure as code (Terraform, Ansible, etc.).
- Excellent problem-solving skills and the ability to work in a fast-paced, AI-focused environment.
- Experience with Python-based AI frameworks and APIs.
- Familiarity with retrieval systems, vector databases, or RAG pipelines.
- Understanding of data governance, compliance, and responsible AI principles.