The Junior Data Engineer will be responsible for creating, maintaining, and modifying data pipelines to meet reporting and system needs.
DUTIES & RESPONSIBILITIES
- Design, build, and orchestrate scalable data workflows and architecture aligned with business goals
- Develop and maintain robust batch and real-time data pipelines using modern tools and platforms
- Manage structured and unstructured databases with a focus on performance, availability, and optimization
- Ensure high standards of data quality, consistency, and accessibility across integrated systems
- Collaborate with cross-functional teams to gather requirements and translate them into data solutions
- Implement data governance standards covering security, privacy, and compliance
- Proactively monitor and troubleshoot system performance, ensuring uptime and reliability of pipelines
- Transform raw data into clean, well-documented, and report-ready datasets
- Maintain clear documentation of data processes, workflows, and architecture for knowledge sharing
- Other duties as assigned by the manager
KNOWLEDGE & EXPERIENCE
Education:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
Experience:
- 3+ years of experience working with data-driven solutions, with a focus on transforming healthcare or operational datasets into actionable insights to support process improvement and analytics initiatives
- 2+ years of experience collaborating with stakeholders to design, build, and deploy scalable data engineering deliverables, including data pipelines, ETL workflows, and integrated data models tailored for analytical needs
- 2+ years of experience creating and maintaining self-service data infrastructure, such as datasets, dashboards, and backend integrations that empower end users to explore metrics and monitor KPIs through BI tools
Credentials:
- Industry certifications in data engineering or cloud platforms/frameworks
Knowledge and Skills:
- Data architecture design and data modeling (relational & dimensional)
- Development of batch and real-time data pipelines using ETL/ELT tools
- Strong proficiency in SQL, Python, and optionally Scala/Java
- Hands-on experience with cloud platforms: AWS, Azure, GCP
- Advanced database management with SQL Server, PostgreSQL, MongoDB, etc.
- Experience with big data technologies like Spark, Hadoop, and Kafka
- Data quality assurance, profiling, cleansing, and governance practices
- Use of orchestration and workflow tools such as Apache Airflow, dbt, and Talend
- Expertise in Microsoft Excel (pivot tables, macros) and Google Apps Script
- Familiarity with documentation and design tools: Microsoft Word, PowerPoint, Visio, G Suite
- Skilled in handling and optimizing large-scale datasets and infrastructure