Databricks Developer

India

Position Description:

We are seeking a skilled Databricks Developer to join our Data Development team. Reporting to the Team Lead, Data Development, the Databricks Developer implements robust data pipelines using Apache Spark on Databricks, supports advanced data transformation, and enables scalable data products that serve enterprise analytics and reporting needs. You’ll work closely with data engineers and analysts to ensure high-performance, reliable data pipelines and quality outputs.

This is a hands-on development role focused on engineering scalable, maintainable, and optimized data flows in a modern cloud-based environment.

Job Responsibilities:

  • Design, build, and maintain scalable data pipelines and workflows using Databricks (SQL, PySpark, Delta Lake).
  • Develop efficient ETL/ELT pipelines for structured and semi-structured data using Azure Data Factory (ADF) and Databricks notebooks/jobs.
  • Integrate and transform large-scale datasets from multiple sources into unified, analytics-ready outputs.
  • Optimize Spark jobs and manage Delta Lake performance using techniques such as partitioning, Z-ordering, broadcast joins, and caching (see the first sketch after this list).
  • Design and implement data ingestion pipelines for RESTful APIs, transforming JSON responses into Spark tables (see the second sketch after this list).
  • Apply data modeling best practices and data warehousing concepts.
  • Perform data validation and quality checks.
  • Work with various data formats, including JSON, Parquet, and Avro.
  • Build and manage data orchestration pipelines, including linked services and datasets for ADLS, Databricks, and SQL Server.
  • Create parameterized and dynamic ADF pipelines, and trigger Databricks notebooks from ADF.
  • Collaborate closely with Data Scientists, Data Analysts, Business Analysts, and Data Architects to deliver trusted, high-quality datasets.
  • Contribute to data governance and metadata documentation, and ensure adherence to data quality standards.
  • Use version control tools (e.g., Git) and CI/CD pipelines to manage code deployment and workflow changes.
  • Develop real-time and batch processing pipelines for streaming data sources such as MQTT, Kafka, and Event Hub (see the third sketch after this list).
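
As a rough illustration of the Delta Lake tuning techniques named above, the PySpark sketch below combines Z-ordering, a broadcast join, and caching. The table and column names (sales_fact, customer_dim, customer_id) are hypothetical placeholders, not part of this posting:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.getOrCreate()

    # Compact small files and co-locate rows by a common filter column
    # (Databricks Delta SQL; table and column names are hypothetical).
    spark.sql("OPTIMIZE sales_fact ZORDER BY (customer_id)")

    # Broadcast the small dimension table to avoid a shuffle in the join.
    fact = spark.table("sales_fact")
    dim = spark.table("customer_dim")
    joined = fact.join(broadcast(dim), "customer_id")

    # Cache an intermediate result that several downstream steps reuse.
    joined.cache()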
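
The API-ingestion responsibility might look like this minimal sketch: pull JSON from a REST endpoint with requests and land it as a Delta table. The endpoint URL and table name are hypothetical, and authentication, pagination, and retries are omitted for brevity:

    import requests
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical endpoint; auth, pagination, and retries omitted.
    resp = requests.get("https://api.example.com/v1/orders", timeout=30)
    resp.raise_for_status()
    records = resp.json()  # assumes the endpoint returns a JSON array of objects

    # Infer a schema from the parsed records (an explicit schema is safer
    # in production) and persist as a Delta table for downstream consumers.
    df = spark.createDataFrame(records)
    df.write.format("delta").mode("append").saveAsTable("raw_orders")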
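
For the streaming responsibility, a minimal Structured Streaming sketch reading from Kafka could look like the following; the broker address, topic, payload schema, and checkpoint path are all hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructField, StructType

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical payload schema for IoT telemetry events.
    schema = StructType([
        StructField("device_id", StringType()),
        StructField("temperature", DoubleType()),
    ])

    # Read raw events from a Kafka topic and parse the JSON payload.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
        .option("subscribe", "telemetry")                  # hypothetical topic
        .load()
        .select(from_json(col("value").cast("string"), schema).alias("payload"))
        .select("payload.*")
    )

    # Write to a Delta table with a checkpoint for fault-tolerant recovery.
    (
        events.writeStream.format("delta")
        .option("checkpointLocation", "/mnt/checkpoints/telemetry")
        .toTable("telemetry_bronze")
    )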

Requirements:

  • 5+ years of experience in data engineering or big data development.
  • Bachelor's degree in computer science or a relevant field, or equivalent training and work experience.
  • Strong hands-on experience with Databricks and Apache Spark (PySpark/SQL).
  • Proven experience with Azure Data Factory, Azure Data Lake, and related Azure services.
  • Experience integrating with APIs using Python libraries such as requests and http.
  • Deep understanding of Delta Lake architecture, including performance tuning and advanced features.
  • Proficiency in SQL and Python for data processing, transformation, and validation.
  • Familiarity with data lakehouse architecture and both real-time and batch processing design patterns.
  • Comfortable working with Git, DevOps pipelines, and Agile delivery methodologies.

Additional Desired Qualifications:

  • Experience with dbt, Azure Synapse, or Microsoft Fabric.
  • Familiarity with Unity Catalog features in Databricks.
  • Relevant certifications such as Azure Data Engineer, Databricks, or similar.
  • Understanding of predictive modeling, anomaly detection, or machine learning, particularly with IoT datasets.

Job Demands and/or Physical Requirements:

  • As Seaspan is a global company, occasional work outside of regular office hours may be required.

Job Type: Full-time

Pay: ₹1,700,000.00 - ₹2,500,000.00 per year

Benefits:

  • Cell phone reimbursement
  • Commuter assistance
  • Flexible schedule
  • Internet reimbursement
  • Life insurance

Work Location: In person
