Data Engineer – Job Description

What You'll Do

As a Data Engineer, you'll be a key contributor to building and scaling robust data solutions across the organization. You will:
- Architect Scalable Data Pipelines: Design, develop, and maintain reliable ETL/ELT workflows using Databricks, Spark, and Python.
- Enable Data Access & Analytics: Partner with analytics, product, and engineering teams to ensure timely, accurate, and governed access to data for downstream reporting and analytics.
- Optimize Data Workflows: Improve performance, reduce latency, and streamline processes by tuning SQL, optimizing Spark jobs, and enhancing cloud data pipelines.
- Leverage Cloud Infrastructure: Use AWS services (e.g., S3, Glue, Lambda) to manage and scale data engineering workloads.
- Drive Best Practices: Establish and maintain data engineering standards, including code quality, data security, version control, and documentation.
- Build & Maintain Data Models: Construct and support dimensional and normalized data models that serve cross-functional use cases and reporting needs.
- Automate & Monitor: Set up robust pipeline orchestration (e.g., with Airflow, Databricks Jobs, or AWS Step Functions) along with monitoring and alerting systems.
- Collaborate Cross-Functionally: Work with data analysts, data scientists, and business users to understand requirements and transform raw data into business-ready datasets.
Must-Haves

- 5+ years of experience as a Data Engineer or in a similar role.
- Strong hands-on experience with Databricks (Spark, Delta Lake) and Python-based ETL frameworks.
- Solid experience working with AWS cloud services for data processing and storage.
- Proficiency in SQL for data wrangling, transformation, and performance tuning.
- Experience with data lake architectures, ETL/ELT development, and orchestration tools.
- Familiarity with software engineering best practices, including CI/CD, version control, and code reviews.
- Strong communication and collaboration skills; comfortable working with both technical and non-technical stakeholders.
Nice-to-Haves

- Experience with Power BI or other BI tools (e.g., Tableau, Looker) to support data visualization and self-service reporting.
- Exposure to data governance and data quality frameworks.
- Understanding of data cataloging tools and metadata management.