Clustox
Principal Data Engineer - Azure
Location: Islamabad, Pakistan
Department: Information Technology
Job Description
About the Project
We are a mission-driven team of developers, architects, ML engineers, and data specialists building an innovative cloud-based platform to combat coral reef degradation caused by global warming. By leveraging real-time data pipelines, AI/ML models, and scalable cloud architecture, we aim to deliver actionable insights for marine conservation.
What You'll Do
As a Principal Data Engineer, you'll design and optimize the data systems that power our conservation efforts. Your work will directly affect our ability to monitor, analyze, and restore coral reefs at scale.
Core Responsibilities
- Build scalable ETL/ELT pipelines using Azure Data Factory, Databricks, and Synapse Analytics.
- Integrate real-time and batch data for AI/ML models (Azure ML, MLOps).
- Implement storage solutions (Azure Data Lake, Cosmos DB, SQL DB).
- Optimize pipelines for speed, cost, and reliability (caching, partitioning).
- Monitor, troubleshoot, and fine-tune data workflows.
- Prepare datasets for feature engineering and model training (PySpark, Pandas).
- Collaborate with data scientists to deploy and monitor ML models.
- Enforce data encryption, access controls, and GDPR/HIPAA compliance.
- Work with frontend/backend engineers, DevOps, and conservation scientists.
- Enable data visualization (Power BI, Tableau) for stakeholders.
Who You Are
- 5+ years in data engineering, preferably with Azure cloud services.
- Expert in Python, PySpark, SQL, and big data frameworks.
- Experience with real-time data processing and ML pipeline integration.
- Passionate about sustainability, AI for good, or environmental tech.
- Strong problem-solver who thrives in collaborative, innovative teams.
