We’re looking for a hands-on Data Engineer to join our team and lead the development of robust data pipelines powering the IllumiFi Analytics platform. You’ll work with cutting-edge, cloud-native tools to architect, implement, and optimize serverless data solutions on AWS.
If you enjoy working with Python, PySpark, and modern data lakehouse patterns in a fast-paced environment — this is your opportunity to build impactful solutions from the ground up.
What You’ll Actually Do Here
- Design and implement scalable, serverless data pipelines using AWS services such as Glue, Lambda, Step Functions, Athena, and S3
- Build and optimize ETL processes using Python/PySpark for efficient data transformations
- Ingest and process structured and semi-structured data formats including JSON, Parquet, and CSV
- Integrate with external APIs using REST protocols and OAuth2 authentication
- Model analytical datasets with star/snowflake schemas, implement SCD Type 2, and define partitioning strategies for efficient query performance
- Build and maintain a well-organized S3-based data lake structure to support analytics workloads
- Apply best practices for version control, data quality, and automation using GitHub workflows
- Drive technical design discussions and contribute to improving infrastructure, tooling, and data processes
What We’re Looking For
- 2+ years of experience in Data Engineering
- Strong hands-on experience with AWS services, especially Glue, Lambda, S3, Step Functions, and Athena
- Solid command of Python/PySpark for building and managing ETL pipelines
- Proven track record of working with large-scale datasets and optimizing performance
- Deep understanding of data modeling, transformations, and SQL for analytical use cases
- Familiarity with REST APIs, OAuth2 authentication, and data integration patterns
- Experience with Git, CI/CD, and workflow automation
- A problem-solving mindset and a passion for scalable, efficient data engineering
- Ability to work independently and collaborate across teams
Tools & Platforms
- Languages & Frameworks: Python, PySpark
- Cloud & Tools: AWS Glue, Lambda, S3, Step Functions, Athena
- Data Formats: JSON, Parquet, CSV
- Modeling & Querying: SQL, Star/Snowflake schemas
- Version Control: Git, GitHub workflows
Nice-to-Haves
- Experience with Apache Iceberg or Hudi for data lakehouse architectures
- Knowledge of data quality frameworks like Great Expectations
- Familiarity with Terraform/IaC for infrastructure automation
- Background in e-commerce or analytics platforms
- Exposure to performance tuning for large data workloads
- Hands-on experience with the Step Functions HTTP Task
Job Type: Full-time
Experience:
- Data Engineering: 2 years (Required)
Work Location: In person