We are seeking a skilled Data Engineer to design, build, and optimize modern data pipelines and analytics platforms. The ideal candidate has strong hands-on experience with Snowflake, dbt, Airflow, and Python/Spark, along with solid SQL and data modeling skills. You’ll collaborate with cross-functional teams, including data analysts, data scientists, and business stakeholders, to ensure data reliability, scalability, and performance.
Design, develop, and maintain data pipelines for ingestion, transformation, and delivery using Airflow, dbt, and Snowflake (a brief illustrative sketch follows this list).
Implement and optimize ETL/ELT workflows for structured and semi-structured data.
Develop and maintain data models, schemas, and views in Snowflake to support analytics and reporting.
Build and manage data processing frameworks using Spark (PySpark or Spark SQL).
Integrate data from various sources (databases, APIs, files, cloud storage, streaming data).
Monitor data pipelines for performance, reliability, and cost optimization.
Implement data quality checks, observability, and error handling mechanisms.
Collaborate with data analysts/scientists to understand data needs and deliver scalable solutions.
Apply CI/CD best practices for data pipeline deployment and version control (Git).
Ensure compliance with data governance, security, and privacy policies.
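To give a concrete sense of the day-to-day work, here is a minimal sketch (not a production implementation) of an Airflow DAG orchestrating dbt transformations in Snowflake. It assumes Airflow 2.x and the dbt CLI; the DAG name, schedule, and project path are illustrative placeholders.

# Illustrative only: a minimal daily DAG that builds dbt models in Snowflake
# and then runs dbt tests. The dag_id and project path are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_dbt_snowflake",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Build/refresh the dbt models in Snowflake
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/analytics",   # hypothetical path
    )

    # Run dbt data-quality tests once the models are built
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/analytics",
    )

    dbt_run >> dbt_test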
3–7 years of experience in data engineering or a related field.
Strong expertise with Snowflake (warehousing concepts, performance tuning, cost optimization, security).
Proven experience with dbt (data modeling, testing, documentation, modular SQL).
Hands-on experience with Apache Airflow (DAG design, scheduling, orchestration).
Proficiency in SQL and Python for data manipulation and automation.
Experience with Apache Spark (PySpark preferred); a brief illustrative sketch appears after this list.
Strong understanding of ETL/ELT design patterns and data modeling methodologies, including Kimball dimensional modeling and Data Vault.
Experience with Git, CI/CD, and cloud platforms (AWS, Azure, or GCP).
Knowledge of data quality, observability, and monitoring tools (e.g., Great Expectations or Monte Carlo).
Experience with streaming technologies (Kafka, Kinesis, or Pub/Sub).
Exposure to data cataloging and governance tools (e.g., Collibra, Alation, Amundsen).
Familiarity with Looker, Power BI, or Tableau for data consumption.
Experience with infrastructure as code (Terraform, CloudFormation).
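Similarly, as a rough illustration of the Spark and data-quality work described above, a minimal PySpark sketch might look like the following; the storage paths and column names are hypothetical placeholders, not part of our environment.

# Illustrative only: flatten semi-structured JSON events and apply a basic
# data-quality gate before writing curated output. Paths/columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_flatten").getOrCreate()

# Ingest semi-structured JSON landed in cloud storage (hypothetical path)
raw = spark.read.json("s3://example-bucket/raw/events/")

# Flatten a few nested fields into an analytics-friendly schema
events = raw.select(
    F.col("event_id"),
    F.col("user.id").alias("user_id"),
    F.to_timestamp("event_ts").alias("event_ts"),
    F.col("payload.amount").cast("double").alias("amount"),
)

# Simple data-quality check: fail the job if any primary keys are missing
null_keys = events.filter(F.col("event_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"{null_keys} events are missing event_id")

# Deliver curated output for downstream loading and analytics
events.write.mode("overwrite").parquet("s3://example-bucket/curated/events/")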