Qureos

Senior Data Engineer


Designation: Data Engineer
Experience: 6-8 Years
Location: Mumbai (onsite)

Job Summary

We are seeking a highly skilled Data Engineer with deep expertise in Apache Kafka integration with Databricks, structured streaming, and large-scale data pipeline design using the Medallion Architecture. The ideal candidate will demonstrate strong hands-on experience in building and optimizing real-time and batch pipelines, and will be expected to solve real coding problems during the interview.

Job Description

Key Responsibilities

Design, develop, and maintain real-time and batch data pipelines in Databricks

Integrate Apache Kafka with Databricks using Structured Streaming

Implement robust data ingestion frameworks using Databricks Autoloader

Build and maintain Medallion Architecture pipelines across Bronze, Silver, and Gold layers

Implement checkpointing, output modes, and appropriate processing modes in structured streaming jobs

Design and implement Change Data Capture (CDC) workflows and Slowly Changing Dimensions (SCD) Type 1 and Type 2 logic

Develop reusable components for merge/upsert operations and window function-based transformations

Handle large volumes of data efficiently through proper partitioning, caching, and cluster tuning techniques

Collaborate with cross-functional teams to ensure data availability, reliability, and consistency
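Candidates are expected to reason through SCD Type 2 merge logic in interviews. In Databricks this would normally be a Delta Lake MERGE; below is a minimal plain-Python sketch of the underlying logic only. The table layout and column names (`key`, `value`, `is_current`, `valid_from`, `valid_to`) are illustrative assumptions, not from this posting.

```python
# SCD Type 2 upsert sketch in plain Python (illustrative only; a Databricks
# pipeline would express this as a Delta Lake MERGE on a dimension table).
# Column names are assumptions for the example.

def scd2_upsert(dim, changes, as_of):
    """Expire current rows whose value changed and append new versions."""
    current = {r["key"]: r for r in dim if r["is_current"]}
    for key, value in changes.items():
        old = current.get(key)
        if old is not None and old["value"] == value:
            continue  # unchanged -> keep the existing current row
        if old is not None:
            old["is_current"] = False      # close out the old version
            old["valid_to"] = as_of
        dim.append({"key": key, "value": value, "is_current": True,
                    "valid_from": as_of, "valid_to": None})
    return dim

dim = [{"key": "c1", "value": "Mumbai", "is_current": True,
        "valid_from": "2024-01-01", "valid_to": None}]
dim = scd2_upsert(dim, {"c1": "Pune", "c2": "Delhi"}, as_of="2025-01-01")
# c1 now has an expired version plus a new current one (history preserved);
# c2 is inserted as a brand-new current row.
```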

Must Have Skills

Apache Kafka

Integration, topic management, schema registry (Avro/JSON)

Databricks & Spark Structured Streaming

Output modes: Append, Update, Complete
Sinks: Memory, Console, File, Kafka, Delta
Checkpointing and fault tolerance
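To make the output-mode distinction concrete, here is a hedged plain-Python simulation of how a running aggregation is emitted under Spark's complete and update output modes. This is a toy micro-batch loop, not the Spark API; in a real job the mode is set via `DataFrame.writeStream.outputMode(...)`.

```python
# Toy micro-batch simulator for Structured Streaming output modes
# (illustration only -- real code uses writeStream.outputMode(...)).
from collections import Counter

def run_batches(batches, output_mode):
    """Maintain a running word count and emit rows per micro-batch."""
    totals = Counter()
    emitted = []
    for batch in batches:
        changed = set(batch)
        totals.update(batch)
        if output_mode == "complete":
            emitted.append(dict(totals))   # whole result table every batch
        elif output_mode == "update":
            emitted.append({k: totals[k] for k in changed})  # changed rows only
        else:
            # "append" applies to non-aggregated (or watermarked) queries,
            # so this toy aggregation does not model it
            raise ValueError(f"unsupported output mode: {output_mode}")
    return emitted

batches = [["a", "b"], ["a"]]
print(run_batches(batches, "complete"))  # [{'a': 1, 'b': 1}, {'a': 2, 'b': 1}]
print(run_batches(batches, "update"))    # [{'a': 1, 'b': 1}, {'a': 2}]
```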

Databricks Autoloader

Schema inference, schema evolution, incremental loads

Medallion Architecture

Full implementation expertise across Bronze, Silver, and Gold layers
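As a rough illustration of the Bronze→Silver→Gold flow, the sketch below runs the three layers over plain Python dicts: Bronze keeps raw events as-is, Silver cleans and deduplicates, Gold aggregates for consumption. Field names and quality rules are hypothetical; a real implementation would use Delta tables at each layer.

```python
# Toy Medallion flow on in-memory records (illustrative assumptions only;
# a Databricks pipeline would persist each layer as a Delta table).

def to_silver(bronze):
    """Silver layer: drop bad rows, deduplicate on event_id, normalize types."""
    seen, silver = set(), []
    for e in bronze:
        if e.get("event_id") is None or e.get("amount") is None:
            continue                      # data-quality filter
        if e["event_id"] in seen:
            continue                      # deduplication
        seen.add(e["event_id"])
        silver.append({"event_id": e["event_id"],
                       "customer": e["customer"].strip().lower(),
                       "amount": float(e["amount"])})
    return silver

def to_gold(silver):
    """Gold layer: per-customer revenue aggregate for BI consumption."""
    gold = {}
    for e in silver:
        gold[e["customer"]] = gold.get(e["customer"], 0.0) + e["amount"]
    return gold

bronze = [{"event_id": 1, "customer": " Asha ", "amount": "10"},
          {"event_id": 1, "customer": " Asha ", "amount": "10"},  # duplicate
          {"event_id": 2, "customer": "Ravi", "amount": None},    # bad row
          {"event_id": 3, "customer": "Ravi", "amount": "5.5"}]
print(to_gold(to_silver(bronze)))  # {'asha': 10.0, 'ravi': 5.5}
```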

Performance Optimization

Data partitioning strategies
Caching and persistence
Adaptive query execution and cluster configuration tuning

SQL & Spark SQL

Proficiency in writing efficient queries and transformations

Data Governance

Schema enforcement, data quality checks, and monitoring
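A minimal sketch of the kind of schema-enforcement check implied here, in plain Python. In Databricks this role might instead rely on Delta Lake schema enforcement or expectation frameworks; the schema and field names below are assumptions for illustration.

```python
# Tiny schema/quality check: validate each record against an expected
# schema (field name -> type) and collect violations for monitoring
# rather than failing silently. Schema contents are hypothetical.

EXPECTED_SCHEMA = {"order_id": int, "customer": str, "amount": float}

def check_record(rec, schema=EXPECTED_SCHEMA):
    """Return a list of violation messages for one record (empty = clean)."""
    errors = []
    for field, ftype in schema.items():
        if field not in rec:
            errors.append(f"missing field: {field}")
        elif not isinstance(rec[field], ftype):
            errors.append(f"bad type for {field}: {type(rec[field]).__name__}")
    return errors

assert check_record({"order_id": 1, "customer": "Asha", "amount": 9.5}) == []
```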

Good to Have
Strong coding skills in Python and PySpark

Experience working in CI/CD environments for data pipelines

Exposure to cloud platforms (AWS/Azure/GCP)

Understanding of Delta Lake, time travel, and data versioning

Familiarity with orchestration tools like Airflow or Azure Data Factory

Job Types: Full-time, Permanent

Pay: ₹1,500,000.00 - ₹3,000,000.00 per year

Benefits:

  • Health insurance
  • Leave encashment
  • Life insurance
  • Paid sick time
  • Provident Fund

Ability to commute/relocate:

  • Navi Mumbai, Maharashtra: Reliably commute or planning to relocate before starting work (Required)

Location:

  • Navi Mumbai, Maharashtra (Preferred)

Work Location: In person

© 2025 Qureos. All rights reserved.