What’s important to us:
We are looking for a skilled Data Engineer with at least 4 years of professional experience building, automating, and optimizing data pipelines and cloud-based architectures. The ideal candidate will have hands-on experience with cloud data services (AWS, Azure, or GCP) and CI/CD pipelines for deploying scalable, reliable, and secure data solutions.
The candidate will collaborate with cross-functional teams including data analysts, data scientists, and software engineers to design and maintain robust data infrastructure that supports analytics, AI/ML workflows, and enterprise reporting systems.
Key Responsibilities:
- Design, build, and maintain end-to-end ETL/ELT pipelines using both on-premise and cloud-based technologies.
- Architect and operate data storage and streaming solutions leveraging cloud-based services on AWS, Azure, or GCP.
- Design and implement data ingestion and transformation workflows using Airflow, AWS Glue, or Azure Data Factory.
- Develop and optimize data pipelines using Python and PySpark for large-scale distributed data processing.
- Build data models (normalized, denormalized, and dimensional Star/Snowflake schemas) for analytics and warehousing solutions.
- Implement data quality, lineage, and governance using metadata management and monitoring tools.
- Collaborate with cross-functional teams to deliver clean, reliable, and timely data for analytics and machine learning use cases.
- Integrate CI/CD pipelines for data infrastructure deployment using GitHub Actions, Jenkins, or Azure DevOps.
- Automate infrastructure provisioning using Infrastructure as Code (IaC) tools such as Terraform or AWS CloudFormation.
- Monitor and optimize data processing performance for scalability, reliability, and cost-efficiency.
- Enforce data security policies and ensure compliance with standards such as GDPR and HIPAA.
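To make the day-to-day shape of these responsibilities concrete, here is a minimal extract-transform-load sketch in pure Python (standard library only; the feed, table, and column names are hypothetical, and in this role the equivalent logic would typically run under Airflow, AWS Glue, or Azure Data Factory against cloud storage rather than in-memory SQLite):

```python
import csv
import sqlite3
from io import StringIO

# Hypothetical raw feed; in practice this would land in S3 or Data Lake Gen2.
RAW_CSV = """order_id,amount,currency
1001,19.99,USD
1002,,USD
1003,5.50,EUR
"""

def extract(raw: str) -> list[dict]:
    """Read raw CSV rows into dicts."""
    return list(csv.DictReader(StringIO(raw)))

def transform(rows: list[dict]) -> list[tuple]:
    """Drop rows with missing amounts and cast types (a basic data quality gate)."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # in a real pipeline, route to a dead-letter/error table instead
        clean.append((int(row["order_id"]), float(row["amount"]), row["currency"]))
    return clean

def load(rows: list[tuple], conn: sqlite3.Connection) -> int:
    """Idempotent load into a warehouse-style table; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    loaded = load(transform(extract(RAW_CSV)), conn)
    print(f"loaded {loaded} rows")  # one row is rejected by the quality gate
```

The `INSERT OR REPLACE` keyed on `order_id` makes reruns idempotent, which is the same property the orchestration tools above are used to guarantee at scale.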
Must-Have Skills & Qualifications:
- Education: Bachelor’s or Master’s degree in Computer Science, Information Technology, Data Engineering, or a related field.
- Experience: Minimum 4 years of hands-on experience as a Data Engineer or in data-intensive environments.
- SQL Expertise: Advanced proficiency in SQL for complex queries, joins, window functions, and performance tuning.
- Analytical Databases: Experience working with Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse, and PostgreSQL.
- Query Optimization: Skilled in query optimization, indexing, and execution plan analysis for high-performance analytics workloads.
- Programming: Proficient in Python and PySpark for data manipulation, automation, and pipeline orchestration.
- Data Processing Frameworks: Strong understanding of Apache Spark (RDD, DataFrame, Spark SQL, optimization), Hive, Hadoop, and Flink for large-scale distributed data processing.
- ETL/ELT Frameworks: Hands-on experience designing and maintaining pipelines using Airflow, AWS Glue, or Azure Data Factory.
- Data Integration Patterns: Familiarity with incremental loading, Slowly Changing Dimensions (SCD), Change Data Capture (CDC), and error handling in data pipelines.
- Data Modeling: Expertise in data modeling, schema design, and building normalized, denormalized, and dimensional (Star/Snowflake) schemas.
- Data Architecture: Strong understanding of Data Warehouse, Data Lake, and Lakehouse architectures, including Delta Lake, ACID transactions, and partitioning strategies.
- Cloud Platforms: Practical experience with major cloud ecosystems:
  - AWS: S3, Glue, Redshift, Athena, Lambda, Step Functions, EMR
  - Azure: Data Factory, Data Lake Gen2, Synapse, Databricks
- Cloud Security: Experience managing IAM roles, access control, and encryption in cloud environments.
- Pipeline Optimization: Skilled in optimizing data pipelines for performance, scalability, and cost-efficiency.
- CI/CD and DevOps: Hands-on experience with CI/CD tools such as GitHub Actions, GitLab CI, or Azure DevOps.
- Version Control: Proficient with Git and familiar with agile development practices.
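One of the integration patterns named above, Slowly Changing Dimension Type 2, can be sketched in a few lines of pure Python (illustrative field names; in this role the same logic is usually expressed as a `MERGE` in Snowflake, Redshift, BigQuery, or Delta Lake):

```python
from dataclasses import dataclass, replace

@dataclass
class DimRow:
    """One version of a customer record in a Type 2 dimension."""
    customer_id: int
    city: str
    valid_from: str          # ISO dates kept as strings for brevity
    valid_to: str = ""       # empty string == current version

def scd2_apply(dim: list[DimRow], change: DimRow) -> list[DimRow]:
    """Close the current version if the tracked attribute changed, then append the new one."""
    out = []
    for row in dim:
        if row.customer_id == change.customer_id and not row.valid_to:
            if row.city == change.city:
                return dim  # no change: keep history as-is
            # expire the old version at the new row's effective date
            row = replace(row, valid_to=change.valid_from)
        out.append(row)
    out.append(change)
    return out

history = [DimRow(1, "Berlin", "2023-01-01")]
history = scd2_apply(history, DimRow(1, "Munich", "2024-06-01"))
# history now holds two versions: Berlin (expired) and Munich (current)
```

The point of the pattern is that no history is overwritten: each attribute change closes the prior version's validity window and appends a new current row, so facts can always join to the dimension as it looked on their transaction date.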
Good-to-Have Skills:
- Experience with containerization and orchestration.
- Exposure to data cataloging and governance tools.
- Experience with monitoring tools.
- Familiarity with data APIs and microservices architecture.
- Certification in cloud data engineering (e.g., AWS Certified Data Engineer, Azure Data Engineer Associate, or GCP Professional Data Engineer).
- Experience supporting machine learning and analytics pipelines.
Soft Skills:
- Strong analytical and problem-solving mindset.
- Excellent communication and documentation skills.
- Ability to work collaboratively in a cross-functional, fast-paced environment.
- Strong attention to detail with a focus on data accuracy and reliability.
- Eagerness to learn and adopt emerging data technologies.