Waayslive solution

Data Scientist

India

DS (Vector Search + GCP) - Bangalore

Bangalore

JOB DESCRIPTION

Data/Applied Scientist (Search)

Strong in Python, with experience in Jupyter notebooks and Python packages like polars, pandas, numpy, scikit-learn, matplotlib, etc.

Must have: Experience with the machine learning lifecycle, including data preparation, training, evaluation, and deployment

Must have: Hands-on experience with GCP services for ML & data science
Must have: Experience with Vector Search, Hybrid Search techniques, and Query preprocessing
Must have: Experience with embeddings generation using models like BERT, Sentence Transformers, or custom models

Must have: Experience in embedding indexing and retrieval (e.g., Elastic, FAISS, ScaNN, Annoy)

Must have: Experience with LLMs and use cases like RAG (Retrieval-Augmented Generation)
Must have: Understanding of semantic vs lexical search paradigms
Must have: Experience with Learning to Rank (LTR) techniques and libraries (e.g., XGBoost, LightGBM with LTR support)

Should be proficient in SQL and BigQuery for analytics and feature generation
Should have experience with Dataproc clusters for distributed data processing using Apache Spark or PySpark

Should have experience deploying models and services using Vertex AI, Cloud Run, or Cloud Functions

Should be comfortable working with BM25 ranking (via Elasticsearch or OpenSearch) and blending it with vector-based approaches

Good to have: Familiarity with Vertex AI Matching Engine for scalable vector retrieval
Good to have: Familiarity with TensorFlow Hub, Hugging Face, or other model repositories
Good to have: Experience with prompt engineering, context windowing, and embedding optimization for LLM-based systems

Should understand how to build end-to-end ML pipelines for search and ranking applications
Must have: Awareness of evaluation metrics for search relevance (e.g., precision@k, recall, nDCG, MRR)

Should have exposure to CI/CD pipelines and model versioning practices

GCP Tools Experience:

ML & AI: Vertex AI, Vertex AI Matching Engine, AutoML, AI Platform

Storage: BigQuery, Cloud Storage, Firestore

Ingestion: Pub/Sub, Cloud Functions, Cloud Run

Search: Vector Databases (e.g., Matching Engine, Qdrant on GKE), Elasticsearch/OpenSearch

Compute: Cloud Run, Cloud Functions, Vertex Pipelines, Cloud Dataproc (Spark/PySpark)

CI/CD & IaC: GitLab/GitHub Actions

Job Type: Full-time

Pay: Up to ₹1,700,000.00 per year

Work Location: In person
