Location: Bahria Town, Lahore (Onsite)
Working Hours: 10:00 AM – 07:00 PM (Monday – Friday)
Job Type: Full-time
Role Overview
We are looking for a Python-first Engineer to own and evolve our Apache Airflow ecosystem. This is a pure engineering role—not a BI or reporting position. You will be responsible for architecting the "engine" behind our data movement, moving beyond simple cron jobs to design sophisticated, production-grade orchestration systems. If you have deep experience with the TaskFlow API, distributed task execution, and containerized data environments, we want to hear from you.
Experience Requirements (Non-Negotiable)
- 3+ years of hands-on production experience with Apache Airflow (2.x). Must be comfortable with DAG design patterns and the TaskFlow API.
- 4+ years in Software or Data Engineering with a strong focus on backend systems.
- Advanced Python Proficiency: Writing modular, testable code using pytest and following PEP 8 standards.
- Infrastructure Mastery: Proven experience with Docker/Docker Compose, PostgreSQL metadata management, and Redis-based messaging.
- Distributed Systems: Clear understanding of Celery Executor concepts and XCom backend optimization.
Key Responsibilities
1. Data Platform & Orchestration Engineering
- Build and maintain production-grade Airflow DAGs that are resilient and scalable.
- Architect clean Extract → Transform → Load layers with a focus on reusability.
- Develop shared ingestion frameworks and modular "plug-and-play" data modules.
- Implement rigorous schema validation (JSON Schema) and dynamic task execution logic.
- Design advanced retry strategies, custom operators, and failure handling mechanisms.
2. Distributed Workflow & Messaging
- Optimize the Celery Executor and manage Redis message brokering for high-volume tasks.
- Maintain the integrity of the Airflow metadata database (PostgreSQL) and monitor pipeline state and health.
- Implement best practices for logging, traceability, and platform-wide observability.
3. Infrastructure & Engineering Excellence
- Manage environment separation (Dev/Staging/Prod) using containerized workflows.
- Standardize repository organization and enforce strict Git workflow discipline.
- Integrate CI/CD pipelines to automate the deployment of data workflows.
- Handle production data incidents with a focus on root-cause analysis and long-term resilience.
What We’re Looking For
- Systems-thinking mindset: You don't just fix a bug; you fix the architecture that allowed it.
- Reliability Obsession: You build pipelines that fail gracefully and recover automatically.
- Clean Code Advocate: You believe data code should be as rigorous as application code.
- Ownership: You take pride in the uptime and health of the platforms you build.
Pay: Rs 300,000 – Rs 400,000 per month
Work Location: In person