Build Low-Latency Conversational AI Systems
We build real-time conversational AI systems on top of large language models, speech AI, and agentic workflows. Our platform combines automatic speech recognition (ASR), LLMs, and text-to-speech (TTS) into production-grade AI systems used globally across enterprise environments where latency, reliability, and scalability matter.
We are hiring a Machine Learning Engineer to build low-latency production systems for our LLM team. This role is centred on writing scalable code that enables real-time conversational AI to perform reliably under heavy production workloads.
You’ll work closely with our LLM and speech teams to solve challenges around inference speed, concurrency, request handling, GPU performance, distributed systems, and real-time response streaming.
What you’ll do
What we’re looking for
Why this role?
You’ll design and build low-latency conversational AI systems that serve large volumes of concurrent real-time requests. The role focuses on hard engineering problems in inference speed, reliability, concurrency, GPU performance, and scalable production AI systems.
© 2026 Qureos. All rights reserved.