Qureos

AI Backend Engineer — LLM Inference (DeepSeek, vLLM, FastAPI)

Job Type: 45-day contract (full-time), potential long-term role

About the Role:

We are building a real-time AI system using open-source LLMs. Your job is to set up and optimize the backend deep learning inference infrastructure; you will NOT work on business logic, only the engine.

Responsibilities:

  • Install, configure, and optimize DeepSeek R1 / V3 models
  • Deploy a vLLM or LM Studio inference server
  • Build a FastAPI backend to expose custom LLM APIs (a minimal sketch follows this list)
  • Handle GPU optimization and quantization (AWQ, GPTQ, FP8)
  • Manage model weights, tokenizers, and streaming endpoints
  • Implement secure API access via API keys
  • Work closely with a system architect (CTO-level guidance provided)
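
For context, a minimal sketch of the stack these bullets describe: a FastAPI service in front of a vLLM OpenAI-compatible server, with key-based access control and a streaming endpoint. The endpoint path, header name, environment variables, and model id are illustrative assumptions, not part of this posting.

```python
# Minimal sketch, not a reference implementation: expose a streaming LLM
# endpoint behind an API key, proxying a local vLLM OpenAI-compatible server.
# VLLM_URL, MODEL_ID, and the x-api-key header are placeholders for illustration.
import os

import httpx
from fastapi import FastAPI, Header, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

VLLM_URL = os.getenv("VLLM_URL", "http://localhost:8000/v1/chat/completions")
API_KEY = os.getenv("SERVICE_API_KEY", "change-me")  # shared-secret access key
MODEL_ID = os.getenv("MODEL_ID", "deepseek-ai/DeepSeek-R1")  # placeholder id

app = FastAPI()


class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 512


@app.post("/v1/generate")
async def generate(req: GenerateRequest, x_api_key: str = Header(default="")):
    # Reject callers that do not present the expected API key.
    if x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="invalid API key")

    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": req.prompt}],
        "max_tokens": req.max_tokens,
        "stream": True,  # request server-sent-event chunks from vLLM
    }

    async def token_stream():
        # Relay vLLM's SSE lines to the caller as they arrive.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", VLLM_URL, json=payload) as resp:
                async for line in resp.aiter_lines():
                    if line:
                        yield line + "\n"

    return StreamingResponse(token_stream(), media_type="text/event-stream")
```

Run it with uvicorn and call /v1/generate with an x-api-key header; production concerns such as rate limiting and key rotation are deliberately left out of this sketch.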

Qualifications:

  • Strong Python + FastAPI skills
  • Experience with vLLM / TGI / Ollama / LM Studio
  • Deep learning fundamentals (PyTorch)
  • Knowledge of GPU environments (CUDA, cuDNN)
  • Experience deploying LLMs locally or in the cloud (see the deployment sketch below)
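
For reference, a sketch of local quantized deployment with vLLM's offline Python API. The checkpoint name is a hypothetical placeholder, and the quantization mode is one of the options (AWQ, GPTQ, FP8) named above.

```python
# Minimal sketch of local inference with vLLM; the model path is a hypothetical
# placeholder for an AWQ-quantized DeepSeek checkpoint, not a specific repo.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/deepseek-awq-checkpoint",  # hypothetical quantized weights
    quantization="awq",                        # or "gptq" / "fp8"
    gpu_memory_utilization=0.90,               # leave some headroom on the GPU
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize what a KV cache does."], params)
print(outputs[0].outputs[0].text)
```

The same checkpoint can also be served over HTTP with vLLM's OpenAI-compatible server, which is what the FastAPI sketch above assumes is running.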

Job Type: Contract
Contract length: 45 days

Pay: Rs120,000.00 - Rs180,000.00 per month

Work Location: Remote
