JOB SUMMARY:
We are looking for a Data Scientist (GenAI, NLP) to build and optimize the platform for a Generative AI assistant (a complex RAG system with several data sources and tool calling). The ideal candidate is experienced in Python and ML libraries, in building LLM systems, and in using Azure services.
JOB RESPONSIBILITIES:
- Understand the business problem, the challenges of existing technologies, and the areas of application for AI technologies
- Identify and choose the right AI or cognitive computing technologies for solving problems and formulate AI recipes for development
- Develop required machine learning models or prototype applications applying formulated AI recipes and verify the problem/solution fit
- Set up and manage the AI development and production infrastructure
- Help product managers and business stakeholders understand the potential and limitations of AI when planning new products
- Collaborate with other engineers and data scientists to enable seamless AI interactions
- Build data ingestion and data transformation infrastructure
- Identify new training datasets and transfer learning opportunities, and adapt to them
- Build AI models from scratch, deploy them into production and help stakeholders understand the outcomes
- Create APIs and help business customers transfer results from the developed AI models to operations
- Recommend AI solutions that can improve existing products
JOB QUALIFICATIONS:
- 4-5 years of programming experience working on AI/ML products and models
- Solid programming language skills in Python
- Experience building LLM/RAG systems and agents (LangChain, LangGraph, LangSmith preferred)
- Strong knowledge of machine learning tools and frameworks for data management, visualization, implementation and performance evaluation, such as scikit-learn, NumPy and pandas
- Demonstrated ability to deliver working solutions on a tight schedule
- Very strong algorithmic thinking
- Experience with version control tools (Git preferred)
- Solid understanding of RAG systems and approaches, NLP, the data science process, statistical modeling, machine learning and deep learning
- Strong hands-on experience with developing RESTful APIs, preferably using FastAPI
- Experience with cloud development (Docker, Kubernetes) and cloud-specific AI technologies (Azure Machine Learning Services strongly preferred, including AI Foundry (AI ML Studio), Azure Prompt Flow, Azure Functions and Logic Apps)
- Database experience (SQL at a minimum)
NICE TO HAVE:
- Experience building AI models in frameworks such as Keras, TensorFlow or Theano
- Experience working with Unix or Unix-like operating systems
- Background in applying quantitative methods in a business setting
- Experience with Azure cloud deployment and Azure AI evaluation tools