We are seeking an experienced Data Engineer to join our Data & AI team and deliver core data platform capabilities. This hands-on role focuses on building robust Python-based data pipelines, ETL processes, and data models across a modern data lake/lakehouse environment. Strong Python and SQL skills are essential, along with a solid understanding of data lake architecture. Experience with data quality tooling (Informatica, Purview, Great Expectations, or similar) is a strong advantage.
Key Responsibilities
- Design and maintain end-to-end Python-based data pipelines for ingestion, transformation, and delivery.
- Build and optimise ETL/ELT workflows across Bronze, Silver, and Gold layers using Medallion architecture.
- Write clean, modular, production-grade Python code for data processing and automation.
- Support data model, schema, and storage design for analytics and reporting.
- Develop SQL transformations, stored procedures, and views for validation and reconciliation.
- Build ingestion frameworks for files, APIs, databases, and streaming sources.
- Implement data quality checks across completeness, validity, consistency, uniqueness, accuracy, and timeliness.
- Monitor pipeline health, implement alerting, and troubleshoot production issues.
- Contribute to CI/CD pipelines, automated testing, and version control.
- Produce clear documentation for pipelines, models, and runbooks.
Required Skills
- Strong Python development (pandas, PySpark, etc.) and production pipeline experience.
- Solid SQL for complex transformations and performance tuning.
- Hands-on ETL/ELT design and implementation.
- Understanding of Medallion architecture and modern data lake/lakehouse patterns.
- Experience with orchestration tools.
- Working knowledge of cloud platforms (Azure, AWS, or GCP).
- Familiarity with relational and NoSQL databases.
- Strong debugging and problem-solving skills.
Desirable Skills
- Experience with data quality tools (Informatica IDQ, Purview, Great Expectations, custom DQ frameworks).
- Knowledge of data cataloguing/lineage tools (Informatica EDC, Purview).
- Familiarity with data governance platforms (Axon, Purview).
- Exposure to BI tools such as Power BI.