Job Summary
The Senior AI Data Engineer is responsible for designing, building, and optimizing enterprise-scale data and AI infrastructure to support machine learning models, generative AI applications, and real-time analytics. The role drives the development of end-to-end data pipelines, from ingestion to production-ready AI data products, ensuring scalability, performance, and compliance across multi-cloud environments.
Accountability & Responsibilities
- Design, build, and maintain scalable ETL/ELT data pipelines using modern data engineering tools (e.g., Apache Spark, dbt).
- Architect and implement Lakehouse data platforms (Delta Lake, Apache Iceberg, Apache Hudi) following the Medallion architecture (Bronze/Silver/Gold).
- Develop real-time streaming pipelines using Apache Kafka, Apache Flink, and Spark Structured Streaming.
- Build and optimize AI/GenAI data pipelines for LLM training, fine-tuning, and inference (tokenization, dataset curation, prompt engineering datasets).
- Design and implement Retrieval-Augmented Generation (RAG) pipelines, including embedding workflows and vector database integration.
- Manage feature stores for real-time and batch machine learning use cases.
- Integrate data pipelines with AI/ML platforms (Databricks MLflow, Azure ML, AWS SageMaker, Vertex AI, OpenAI/Azure OpenAI).
- Implement data orchestration workflows using Apache Airflow or similar tools with CI/CD pipelines.
- Ensure data quality, governance, and security using frameworks such as Great Expectations and data catalog tools.
- Deploy and manage infrastructure using Infrastructure-as-Code tools (Terraform, Bicep, CDK).
- Collaborate with Data Scientists, ML Engineers, and Solution Architects to deliver production-ready AI solutions.
- Lead technical design decisions, mentor junior engineers, and contribute to data platform strategy.
- Maintain documentation, data contracts, and operational runbooks for all pipelines.
Requirements
1 – Required Experience
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 4–5 years of experience in data engineering, with strong exposure to AI/ML data infrastructure.
- Proven experience building scalable data pipelines and working with large-scale datasets.
- Hands-on experience with AI/ML platforms and modern data architectures.
- Experience in regulated industries (e.g., Banking, Telecom, Healthcare) is a plus.
- Strong problem-solving, analytical thinking, and communication skills.
- Experience working in cross-functional teams and agile environments.
2 – Technical Skills
- Strong SQL and advanced data modeling techniques
- Apache Spark (PySpark, Spark SQL, Streaming)
- Python (pandas, PySpark, data processing libraries)
- Data pipeline orchestration (Apache Airflow)
- CI/CD for data pipelines (GitHub Actions / Azure DevOps)
- Lakehouse architectures (Delta Lake / Iceberg / Hudi)
- Streaming technologies (Kafka, Flink)
- Cloud platforms (AWS / Azure / GCP)
- Vector databases (Pinecone, Weaviate, pgvector, OpenSearch)
- RAG pipeline design and LLM data processing
- Infrastructure-as-Code (Terraform / Bicep / CDK)
- Containers (Docker, Kubernetes)
- Data quality & governance tools (e.g., Great Expectations, data catalogs)