Qureos

Data Scientist

India

DS (Vector Search + GCP) - Bangalore

Bangalore

JOB DESCRIPTION

Data/Applied Scientist (Search)

  • Strong Python skills and experience with Jupyter notebooks and Python packages like polars, pandas, numpy, scikit-learn, matplotlib, etc.
  • Must have: Experience with the machine learning lifecycle, including data preparation, training, evaluation, and deployment
  • Must have: Hands-on experience with GCP services for ML & data science
  • Must have: Experience with Vector Search, Hybrid Search techniques, and Query preprocessing
  • Must have: Experience with embedding generation using models like BERT, Sentence Transformers, or custom models
  • Must have: Experience in embedding indexing and retrieval (e.g., Elastic, FAISS, ScaNN, Annoy)
  • Must have: Experience with LLMs and use cases like RAG (Retrieval-Augmented Generation)
  • Must have: Understanding of semantic vs lexical search paradigms
  • Must have: Experience with Learning to Rank (LTR) techniques and libraries (e.g., XGBoost, LightGBM with LTR support)
  • Should be proficient in SQL and BigQuery for analytics and feature generation
  • Should have experience with Dataproc clusters for distributed data processing using Apache Spark or PySpark
  • Should have experience deploying models and services using Vertex AI, Cloud Run, or Cloud Functions
  • Should be comfortable working with BM25 ranking (via Elasticsearch or OpenSearch) and blending it with vector-based approaches
  • Good to have: Familiarity with Vertex AI Matching Engine for scalable vector retrieval
  • Good to have: Familiarity with TensorFlow Hub, Hugging Face, or other model repositories
  • Good to have: Experience with prompt engineering, context windowing, and embedding optimization for LLM-based systems
  • Should understand how to build end-to-end ML pipelines for search and ranking applications
  • Must have: Awareness of evaluation metrics for search relevance (e.g., precision@k, recall, nDCG, MRR)
  • Should have exposure to CI/CD pipelines and model versioning practices

GCP Tools Experience:

ML & AI: Vertex AI, Vertex AI Matching Engine, AutoML, AI Platform

Storage: BigQuery, Cloud Storage, Firestore

Ingestion: Pub/Sub, Cloud Functions, Cloud Run

Search: Vector Databases (e.g., Matching Engine, Qdrant on GKE), Elasticsearch/OpenSearch

Compute: Cloud Run, Cloud Functions, Vertex Pipelines, Cloud Dataproc (Spark/PySpark)

CI/CD & IaC: GitLab/GitHub Actions

Job Type: Full-time

Pay: Up to ₹1,700,000.00 per year

Work Location: In person

© 2025 Qureos. All rights reserved.