As a System Engineer, you will deploy, optimize, and maintain our local AI systems, including large language models (LLMs), embedding models, rerankers, and retrieval pipelines. The role focuses on ensuring reliable local inference, policy-safe routing, and end-to-end retrieval-augmented generation (RAG) performance within a fully private environment.
Responsibilities:
* Deploy and configure local LLMs (Ollama/vLLM) for low‑latency chat and retrieval tasks.
* Integrate embedding models and rerankers (e.g., bge, jina, gte, or Hugging Face alternatives).
* Implement hybrid retrieval (BM25 + vector) pipelines with pgvector (an illustrative sketch of such a query follows this list).
* Own and maintain the policy engine controlling model routing and classification (local vs external).
* Conduct performance benchmarking and quantization tests for different model sizes.
* Tune model parameters for optimal inference on available GPUs.
* Collaborate with Backend engineers to wire AI inference APIs into FastAPI services.
* Develop scripts to monitor model uptime, latency, and retrieval quality.
* Maintain reproducibility: model versions, config hashes, and deterministic inference logs.
* Contribute to the Q‑CERT pipeline with model metadata and audit hashes.
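For context, here is a minimal sketch of the kind of hybrid (BM25-style + vector) query this role involves. The table name "documents", its columns, and the 0.4/0.6 score weighting are assumptions made for illustration, not specifications from this posting.

```python
# Illustrative only: hybrid lexical + vector retrieval against Postgres/pgvector.
# Assumes psycopg 3 and the pgvector extension; table/column names are hypothetical.
import psycopg

HYBRID_SQL = """
WITH lexical AS (
    SELECT id,
           ts_rank_cd(to_tsvector('english', content),
                      plainto_tsquery('english', %(q)s)) AS lex_score
    FROM documents
    WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %(q)s)
    ORDER BY lex_score DESC
    LIMIT 50
),
semantic AS (
    SELECT id,
           1 - (embedding <=> %(vec)s::vector) AS vec_score  -- cosine similarity
    FROM documents
    ORDER BY embedding <=> %(vec)s::vector
    LIMIT 50
)
SELECT d.id,
       d.content,
       0.4 * COALESCE(l.lex_score, 0) + 0.6 * COALESCE(s.vec_score, 0) AS score
FROM documents d
LEFT JOIN lexical  l ON l.id = d.id
LEFT JOIN semantic s ON s.id = d.id
WHERE l.id IS NOT NULL OR s.id IS NOT NULL
ORDER BY score DESC
LIMIT %(k)s;
"""

def hybrid_search(conn: psycopg.Connection, query: str, query_vec: list[float], k: int = 10):
    """Return top-k rows ranked by a weighted blend of full-text and vector scores."""
    with conn.cursor() as cur:
        # pgvector accepts a bracketed string literal such as "[0.1, 0.2, ...]"
        cur.execute(HYBRID_SQL, {"q": query, "vec": str(query_vec), "k": k})
        return cur.fetchall()
```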
Required Skills:
* Python (LangChain or LlamaIndex).
* Hugging Face Transformers and embeddings.
* Familiarity with Ollama, vLLM, or text‑generation‑inference.
* Basic GPU management, CUDA, and quantization (GGUF, GPTQ, AWQ).
* Understanding of RAG systems and evaluation metrics.
* Linux environment management and containerized inference (Docker).
Preferred (Bonus):
* Experience with fine‑tuning or LoRA adapters.
* Familiarity with vector DBs (pgvector, FAISS).
* Exposure to model evaluation tools (RAGAS, DeepEval).
* Knowledge of policy enforcement or prompt‑guard frameworks.
Work Style:
* Works closely with the Backend/Infra Engineer on deployment and data pipelines.
* Weekly sync with Frontend team to validate outputs and UI integration.
* Expected to test and log all model benchmarks before production use.
* Operates in a secure internal environment; no cloud data leakage is permitted.
Notes:
Initial 3-month engagement with an option to extend based on model stability, performance gains, and adherence to privacy protocols.
Job Type: Full-time
Pay: ₹25,000.00 - ₹45,000.00 per month
Work Location: In person