Overview:
The Principal Data Software Engineer is responsible for designing, developing, and maintaining data-driven systems, including ETL pipelines, backend services, and machine learning workflows. This role plays a critical part in enabling reliable data processing, transformation, and delivery across multiple environments, ensuring high data quality, scalability, and performance.
The position requires strong expertise in Python-based development, data engineering practices, and modern data platforms, along with the ability to support production systems and continuously improve data workflows.
Key Responsibilities:
- Design, develop, and maintain Python-based backend services using frameworks such as FastAPI and Django, with Prefect for workflow orchestration
- Build and manage scalable ETL pipelines and data workflows across landing, consumption, and application layers
- Transform, clean, and validate data to ensure accuracy, consistency, and quality across datasets
- Design and maintain relational database schemas, including writing and managing database migrations
- Develop and orchestrate machine learning workflows, supporting model integration and data pipelines
- Containerize applications and manage deployments using Kubernetes and Helm
- Build and maintain CI/CD pipelines to enable automated testing, deployment, and delivery
- Implement and enhance observability practices, including monitoring pipeline health, performance, and logs
- Apply data governance, security, and compliance standards, ensuring proper data handling and access controls
- Provide production support for data, ETL, and ML services, including incident investigation, root cause analysis, and resolution
- Collaborate with data scientists, backend engineers, and DevOps teams to deliver end-to-end data solutions
- Continuously optimize data pipelines and systems for performance, scalability, and cost efficiency
Qualifications:
Education:
- Bachelor’s degree in Computer Science, Software Engineering, Data Engineering, or a related technical field.
Experience:
- 6–10 years of experience in Data Engineering.
- Experience with Python backend frameworks such as FastAPI and Django.
- Experience building ETL pipelines using workflow orchestration tools such as Airflow and Prefect, and data transformation libraries such as pandas.
- Solid experience with relational databases (PostgreSQL) and database migration tools.
Skills & Competencies:
- Strong proficiency in Python and SQL.
- Familiarity with developing machine learning and forecasting models, and using ML experiment tracking tools.
- Familiarity with containerization (Docker), orchestration (Kubernetes, Helm), and CI/CD pipelines.
- Familiarity with observability tooling and experience writing automated tests using pytest.
- Strong troubleshooting skills and the ability to support production environments.