JOB_REQUIREMENTS
Hires in: Not specified
Employment Type: Not specified
Company Location: Not specified
Salary: Not specified
Design, build, and operate a production server-side AI assistant that:
Answers questions grounded in user-scoped data (docs, tables), with citations where applicable
Performs actions by calling internal/external APIs securely on the user's behalf
Supports low-latency, real-time voice chat (streaming STT/TTS + incremental LLM responses)
Implement the tool/agent layer:
Structured tool calling (JSON Schema-based) to integrate business services
Model Context Protocol (MCP) servers/clients where appropriate for tool discovery and execution
Architect retrieval-augmented generation (RAG):
Ingestion for documents and tables, parsing, chunking, embeddings, metadata, and indexing
Hybrid retrieval (sparse+dense), query rewriting, and answer attribution
Deliver performant, cost-efficient inference on open-source models:
Model selection/routing; context management; caching/batching; streaming token delivery
GPU utilization and serving via vLLM/TGI/llama.cpp/Ollama or similar
Build resilient APIs and real-time integrations:
WebSockets/WebRTC/gRPC for streaming voice; REST/GraphQL for control and orchestration
Productionize and operate on server/on-prem:
Containerize with Docker; automate CI/CD; implement logs/metrics/traces (OpenTelemetry)
Evals, A/B tests, safety/guardrails, and human-in-the-loop feedback
© 2025 Qureos. All rights reserved.