Key Responsibilities:
- Design, train, and fine-tune LLMs for specific business applications using open-source and proprietary models.
- Build and scale NLP pipelines for text generation, summarization, classification, Q&A, and semantic search.
- Develop and implement Retrieval-Augmented Generation (RAG) systems using vector databases.
- Integrate LLM-based solutions into production environments, ensuring scalability, performance, and reliability.
- Optimize inference latency and throughput of large models for real-time applications.
- Collaborate with data scientists, product managers, and backend engineers to deliver intelligent features end-to-end.
- Leverage prompt engineering and parameter-efficient fine-tuning methods (e.g., LoRA and other PEFT adapters) for model tuning and efficiency.
- Maintain reproducible ML workflows using MLOps tools and CI/CD practices.
Required Skills & Qualifications:
- 5+ years of total experience in software/ML engineering, with at least 2 years working on LLM/NLP applications.
- Proficiency in Python and hands-on experience with ML/DL frameworks such as PyTorch or TensorFlow.
- Deep understanding of transformer architectures and modern LLMs (GPT, LLaMA, Claude, Gemini, etc.).
- Experience with prompt engineering, fine-tuning, or instruction tuning.
- Practical experience with RAG pipelines, embedding models, and vector stores such as FAISS, Pinecone, or Weaviate.
- Experience deploying LLM applications in production at scale (preferably on cloud platforms like AWS, Azure, or GCP).
Job Type: Full-time
Pay: ₹476,387.04 - ₹1,689,471.78 per year
Work Location: In person