Qureos

AI Backend Engineer — LLM Inference (DeepSeek, vLLM, FastAPI)

Job Type: 45-day contract (full-time), potential long-term role

About the Role:

We are building a real-time AI system using open-source LLMs. Your job is to set up and optimize the backend deep learning inference infrastructure; you will NOT work on business logic, only the engine.

Responsibilities:

  • Install, configure, and optimize DeepSeek R1 / V3 models
  • Deploy a vLLM or LM Studio inference server
  • Build a FastAPI backend to expose custom LLM APIs (a minimal sketch follows this list)
  • Handle GPU optimization and quantization (AWQ, GPTQ, FP8)
  • Manage model weights, tokenizers, and streaming endpoints
  • Implement secure API access via API keys
  • Work closely with a system architect (CTO-level guidance provided)
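
For context, a minimal sketch of the stack these bullets describe: a FastAPI service in front of a vLLM OpenAI-compatible server, with key-based access control and a streaming endpoint. The endpoint path, header name, environment variables, and model id are illustrative assumptions, not part of this posting.

```python
# Minimal sketch, not a reference implementation: expose a streaming LLM
# endpoint behind an API key, proxying a local vLLM OpenAI-compatible server.
# VLLM_URL, MODEL_ID, and the x-api-key header are placeholders for illustration.
import os

import httpx
from fastapi import FastAPI, Header, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

VLLM_URL = os.getenv("VLLM_URL", "http://localhost:8000/v1/chat/completions")
API_KEY = os.getenv("SERVICE_API_KEY", "change-me")  # shared-secret access key
MODEL_ID = os.getenv("MODEL_ID", "deepseek-ai/DeepSeek-R1")  # placeholder id

app = FastAPI()


class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 512


@app.post("/v1/generate")
async def generate(req: GenerateRequest, x_api_key: str = Header(default="")):
    # Reject callers that do not present the expected API key.
    if x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="invalid API key")

    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": req.prompt}],
        "max_tokens": req.max_tokens,
        "stream": True,  # request server-sent-event chunks from vLLM
    }

    async def token_stream():
        # Relay vLLM's SSE lines to the caller as they arrive.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", VLLM_URL, json=payload) as resp:
                async for line in resp.aiter_lines():
                    if line:
                        yield line + "\n"

    return StreamingResponse(token_stream(), media_type="text/event-stream")
```

Run it with uvicorn and call /v1/generate with an x-api-key header; production concerns such as rate limiting and key rotation are deliberately left out of this sketch.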

Qualifications:

  • Strong Python + FastAPI skills
  • Experience with vLLM / TGI / Ollama / LM Studio
  • Deep learning fundamentals (PyTorch)
  • Knowledge of GPU environments (CUDA, cuDNN)
  • Experience deploying LLMs locally or in the cloud (see the deployment sketch below)
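
For reference, a sketch of local quantized deployment with vLLM's offline Python API. The checkpoint name is a hypothetical placeholder, and the quantization mode is one of the options (AWQ, GPTQ, FP8) named above.

```python
# Minimal sketch of local inference with vLLM; the model path is a hypothetical
# placeholder for an AWQ-quantized DeepSeek checkpoint, not a specific repo.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/deepseek-awq-checkpoint",  # hypothetical quantized weights
    quantization="awq",                        # or "gptq" / "fp8"
    gpu_memory_utilization=0.90,               # leave some headroom on the GPU
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize what a KV cache does."], params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be served over HTTP with vLLM's OpenAI-compatible server, which is what the FastAPI sketch above assumes is running.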

Job Type: Contract
Contract length: 45 days

Pay: Rs120,000.00 - Rs180,000.00 per month

Work Location: Remote
