About the Role
You will be one of the earliest engineering hires responsible for building the technical backbone that powers our 3D-volume foundation model and the agentic medical AI systems built on top of it.
This role blends ML systems engineering, high-performance computing, and foundation-model infrastructure, enabling our research scientists to train and deploy cutting-edge multimodal models at scale.
You will design the pipelines, tooling, distributed systems, and evaluation frameworks that make world-class research possible—and usable in clinical settings.
If you're the kind of engineer who loves training clusters, PyTorch internals, scalable data loaders, CUDA kernels, model parallelism, and agentic inference systems, this is your role.
What You Will Work On
Model Training Infrastructure & Systems
- Architect and maintain large-scale training pipelines for multimodal foundation models (3D volumes + text).
- Implement distributed training using data parallelism, tensor parallelism, pipeline parallelism, and FSDP/ZeRO strategies.
- Optimize training performance across A100/H100 clusters, including kernel-level optimizations and memory efficiency tuning.
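One concrete example of the memory-efficiency tuning mentioned above is activation checkpointing, which trades recomputation for activation memory during backward. A minimal sketch using PyTorch's built-in utility, with a hypothetical stack of blocks standing in for a 3D-volume encoder:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Hypothetical deep block stack standing in for a 3D-volume encoder.
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(64, 64), nn.ReLU()) for _ in range(8)]
)
x = torch.randn(4, 64, requires_grad=True)

# Run the stack in 4 checkpointed segments: activations inside each segment
# are recomputed during backward instead of stored, shrinking peak memory.
out = checkpoint_sequential(model, 4, x, use_reentrant=False)
loss = out.sum()
loss.backward()
print(x.grad.shape)  # gradients still flow through the checkpointed segments
```

In real training this composes with FSDP/ZeRO sharding; segment count is a throughput-vs-memory knob to profile per model.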
Data & Multimodal Engineering
- Build scalable ingestion, preprocessing, and storage systems for 3D medical volumes, DICOM series, voxel grids, and text datasets.
- Create multimodal data loaders and augmentation pipelines for high-throughput training.
- Work on dataset versioning, weak-label pipelines, and automatic metadata extraction.
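The shape of a paired (volume, text) loader described above can be sketched with a standard PyTorch `Dataset`; the class name, shapes, and synthetic data here are illustrative assumptions, not the actual pipeline:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class VolumeTextDataset(Dataset):
    """Hypothetical paired (3D volume, tokenized report) dataset."""
    def __init__(self, n_items=16, shape=(8, 32, 32), seq_len=12):
        self.n_items, self.shape, self.seq_len = n_items, shape, seq_len

    def __len__(self):
        return self.n_items

    def __getitem__(self, idx):
        g = torch.Generator().manual_seed(idx)           # deterministic synthetic data
        vol = torch.randn(1, *self.shape, generator=g)   # (C, D, H, W) voxel grid
        vol = (vol - vol.mean()) / (vol.std() + 1e-6)    # per-volume intensity normalization
        tokens = torch.randint(0, 1000, (self.seq_len,), generator=g)
        return vol, tokens

loader = DataLoader(VolumeTextDataset(), batch_size=4, shuffle=True, num_workers=0)
vols, toks = next(iter(loader))
print(vols.shape, toks.shape)
```

A production version would stream DICOM series from object storage and raise `num_workers` for throughput; the collation and normalization pattern is the same.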
Model Serving & Agent Runtime
- Build and optimize inference runtimes for 3D-aware models and LLM-based medical agents.
- Develop robust APIs and service layers for clinical workflows (retrieval, reporting, case summarization, multi-step agent chains).
- Implement caching, quantization, batching, vector search, and agent orchestration.
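The batching item above usually means dynamic batching: grouping queued requests under a count and token budget so each forward pass amortizes fixed overhead. A framework-free sketch of the grouping logic (field names like `n_tokens` are assumptions):

```python
from collections import deque

def dynamic_batches(requests, max_batch=8, max_tokens=64):
    """Group queued requests into batches bounded by request count and a
    total token budget -- a common pattern for inference servers."""
    queue, batches = deque(requests), []
    while queue:
        batch, tokens = [], 0
        while (queue and len(batch) < max_batch
               and tokens + queue[0]["n_tokens"] <= max_tokens):
            req = queue.popleft()
            batch.append(req)
            tokens += req["n_tokens"]
        if not batch:                 # single oversized request: emit it alone
            batch.append(queue.popleft())
        batches.append(batch)
    return batches

reqs = [{"id": i, "n_tokens": t} for i, t in enumerate([30, 30, 10, 50, 20])]
print([[r["id"] for r in b] for b in dynamic_batches(reqs)])  # → [[0, 1], [2, 3], [4]]
```

Real runtimes add a wait-time deadline (batch closes after N ms even if under budget) so latency stays bounded at low traffic.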
Tooling & Collaboration
- Develop tools for researchers: experiment launchers, logging/visualization dashboards, model evaluation notebooks, and reproducibility tooling.
- Partner closely with scientists on rapid model iteration, ablations, and experimental design.
- Participate in internal "ML performance tiger teams" to squeeze maximum throughput from models and data pipelines.
Why This Role Appeals to Top-Tier ML Systems Engineers
- You get to build the entire foundational stack behind frontier multimodal models.
- Rare opportunity to combine 3D infrastructure, LLM agents, medical workflows, and distributed systems.
- Direct collaboration with researchers working on CLIP-style models, Chitrarth-type VLMs, document foundation models, and 3D multimodal architectures.
- Massive technical scope, with the freedom to propose new tools, pipelines, and optimization strategies.

- Direct impact: your work will enable clinical-grade AI systems used in radiology and beyond.
What We're Looking For
- Strong engineering experience with PyTorch, JAX, or DeepSpeed, plus hands-on distributed training expertise.
- Deep understanding of GPU internals, CUDA kernels, NCCL, memory profiling, and high-performance data pipelines.
- Experience building large-scale ML pipelines, especially for multimodal or heavy-data workloads (video, 3D, imaging).
- Familiarity with cloud or on-prem HPC scheduling: Slurm, Kubernetes, Ray, etc.
- Proficiency in Python + C++/CUDA; strong command of Linux systems.
- Ability to collaborate deeply with researchers, contribute ideas, and own end-to-end engineering projects.
Nice to Have
- Experience with 3D data (MRI/CT, LiDAR, voxels, meshes, NeRFs).
- Exposure to vector search (FAISS, Milvus, Annoy) and embedding retrieval systems.
- Experience with agent frameworks, LLM serving, or multimodal inference pipelines.
- Contributions to open-source ML systems or performance optimization libraries.
- Background in healthcare/medical imaging pipelines (DICOM, PACS, segmentation workflows).
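For context on the vector-search item above: libraries like FAISS accelerate exactly this operation at scale. A dependency-free sketch of the underlying cosine-similarity retrieval (toy vectors, illustrative only):

```python
import math

def top_k(query, corpus, k=2):
    """Brute-force cosine-similarity retrieval over a list of embeddings --
    the operation that FAISS/Milvus/Annoy index and approximate at scale."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    scored = sorted(enumerate(corpus), key=lambda iv: cos(query, iv[1]), reverse=True)
    return [i for i, _ in scored[:k]]

corpus = [[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0]]
print(top_k([1, 0, 0], corpus))  # → [0, 2]
```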
What We Offer
- Competitive compensation.
- World-class compute access.
- Opportunity to build the core infrastructure for India's first 3D multimodal foundation model.
- Close collaboration with researchers, clinicians, and product teams.
- Autonomy, ownership, and the chance to shape the technical architecture from the ground up.