The Role: As a Lead Data Engineer specializing in Databricks & Master Data Management, you will be a key player in designing, developing, and optimizing our next-generation data platform. You will lead a team of data engineers, providing technical guidance and mentorship and ensuring the delivery of scalable, high-performance data solutions.
Key Responsibilities:
Technical Leadership:
- Lead the design, development, and implementation of scalable and reliable data pipelines using Databricks, Spark, and other relevant technologies.
- Define and enforce data engineering best practices, coding standards, and architectural patterns.
- Provide technical guidance and mentorship to junior and mid-level data engineers.
- Conduct code reviews and ensure the quality, performance, and maintainability of data solutions.
Databricks Expertise:
- Architect and implement data solutions on the Databricks platform, including Databricks Lakehouse, Delta Lake, and Unity Catalog.
- Optimize Spark workloads for performance and cost efficiency on Databricks.
- Develop and manage Databricks notebooks, jobs, and workflows.
- Proficiently use Databricks features such as Delta Live Tables (DLT), Photon, and SQL Analytics (see the sketch after this list).
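To make the Databricks pipeline expectations concrete, here is a minimal Delta Live Tables sketch; the table names, columns, and storage path are illustrative assumptions, not part of the role description.

```python
import dlt
from pyspark.sql.functions import col

# `spark` is provided automatically inside a Delta Live Tables pipeline.

@dlt.table(comment="Raw customer records ingested with Auto Loader (illustrative source).")
def customers_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/customers/")  # hypothetical landing path
    )

@dlt.table(comment="Cleansed customer records with a basic quality expectation.")
@dlt.expect_or_drop("valid_customer_id", "customer_id IS NOT NULL")
def customers_clean():
    return dlt.read_stream("customers_raw").select(
        col("customer_id"), col("email"), col("updated_at")
    )
```

Records failing the expectation are dropped and surfaced in the pipeline's event log, which is one way DLT supports the data quality work described in the next section.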
Master Data Management:
- Lead the technical design and implementation of the MDM solution (either using a dedicated tool or a custom MDM built on Databricks) for critical domains (e.g., Customer, Product, Vendor).
- Define and implement data quality rules, entity resolution, matching, and survivorship logic to create and maintain "Golden Records" (see the sketch after this list).
- Partner with Data Governance and Data Stewardship teams to define and enforce organizational policies, standards, and data definitions for master data assets.
- Ensure seamless and timely provisioning of high-quality master data from the MDM/Lakehouse platform to downstream consuming systems (ERP, CRM, BI).
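As an illustration of the matching and survivorship logic mentioned above, the following simplified PySpark sketch builds golden records from a hypothetical silver.customers table, using a deterministic match key and a most-recently-updated survivorship rule. The table and column names are assumptions for the example; production entity resolution would add fuzzy matching, stewardship review, and attribute-level survivorship.

```python
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # already available in Databricks notebooks

# Hypothetical cleansed customer source; column names are illustrative.
src = spark.table("silver.customers")

# Deterministic match key: normalized email. Real entity resolution would also
# apply fuzzy/probabilistic rules on name, address, phone, etc.
matched = src.withColumn("match_key", F.lower(F.trim(F.col("email"))))

# Survivorship: the most recently updated record per match key wins.
w = Window.partitionBy("match_key").orderBy(F.col("updated_at").desc())
golden = (
    matched.withColumn("rn", F.row_number().over(w))
    .filter("rn = 1")
    .drop("rn", "match_key")
)

# Publish the golden records for downstream consumers (ERP, CRM, BI).
golden.write.format("delta").mode("overwrite").saveAsTable("gold.customer_golden")
```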
Pipeline Development & Operations:
- Develop, test, and deploy robust ETL/ELT pipelines for data ingestion, transformation, and loading from various sources (e.g., relational databases, APIs, streaming data).
- Implement monitoring, alerting, and logging for data pipelines to ensure operational excellence (see the sketch after this list).
- Troubleshoot and resolve complex data-related issues.
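One lightweight way to approach pipeline monitoring on Delta tables is a freshness check against the table history; the sketch below assumes the illustrative gold.customer_golden table from the previous example, a 6-hour SLA, and a UTC session time zone (the Databricks default).

```python
from datetime import datetime, timedelta, timezone
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available in Databricks notebooks

SLA = timedelta(hours=6)  # illustrative freshness SLA

# Latest commit timestamp from the Delta transaction history.
last_write = (
    spark.sql("DESCRIBE HISTORY gold.customer_golden LIMIT 1")
    .select("timestamp")
    .first()["timestamp"]
)

# Compare as naive UTC; assumes the Spark session time zone is UTC.
now = datetime.now(timezone.utc).replace(tzinfo=None)
if now - last_write > SLA:
    # Raising marks the Databricks job run as failed, which can feed job-level alerts.
    raise RuntimeError(f"gold.customer_golden is stale; last write at {last_write}")
```

In practice a check like this would sit alongside dedicated tooling such as DLT event logs or Lakehouse Monitoring.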
Collaboration & Communication:
- Work closely with cross-functional teams, including product managers, data scientists, and software engineers.
- Communicate complex technical concepts clearly to both technical and non-technical stakeholders.
- Stay current with industry trends and emerging technologies in data engineering and Databricks.
Primary Skills:
- Extensive hands-on experience with the Databricks platform, including the Databricks Workspace, Spark on Databricks, Delta Lake, and Unity Catalog.
- Strong proficiency in optimizing Spark jobs and a solid understanding of Spark architecture.
- Experience with Databricks features such as Delta Live Tables (DLT), Photon, and Databricks SQL Analytics.
- Deep understanding of data warehousing concepts, dimensional modeling, and data lake architectures.
- Familiarity with data governance and cataloging tools (e.g., Purview, Profisee).