Who We Are:
We are a dynamic team focused on building innovative and scalable data solutions on Amazon Web Services (AWS). Our AWS Data Engineer will play a key role in designing, building, and maintaining scalable data pipelines and infrastructure, ensuring data availability, accuracy, and performance for business insights and machine learning models. The ideal candidate will bridge the gap between AWS infrastructure and the Snowflake Data Cloud, ensuring a seamless flow of data for analytics.
What We Are Looking For:
We are seeking a highly skilled and experienced AWS Data Engineer to develop and manage data pipelines on AWS. The ideal candidate will bring strong expertise in AWS data services, big data processing, and data modeling to help us deliver high-performance data solutions. Experience optimizing Snowflake performance and managing Snowflake objects within an AWS environment is critical.
Responsibilities:
Data Pipeline Development and Management:
- Design and automate end-to-end data ingestion from APIs, databases, and third-party sources into Amazon S3 using AWS Glue, Kinesis, and Lambda (a minimal ingestion sketch follows this list).
- Develop ETL/ELT processes to process and transform large volumes of structured and unstructured data.
- Optimize data pipeline performance, scalability, and reliability.
- Ensure data processing and ingestion workflows are monitored and meet performance SLAs.
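To give candidates a flavor of this work, here is a minimal sketch of a Lambda handler that lands an API payload in S3. The bucket name, key prefix, and event shape are hypothetical, and a production pipeline would add validation, batching, and error handling:

```python
import json
import os
from datetime import datetime, timezone

import boto3

# Hypothetical landing bucket; in practice this comes from the
# function's configuration (e.g., an environment variable).
BUCKET = os.environ.get("LANDING_BUCKET", "example-raw-landing-bucket")

s3 = boto3.client("s3")


def handler(event, context):
    """Write an incoming API payload to S3 as raw JSON."""
    payload = event.get("body", event)  # shape depends on the trigger
    key = f"raw/api/{datetime.now(timezone.utc):%Y/%m/%d/%H%M%S}.json"
    s3.put_object(
        Bucket=BUCKET,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )
    return {"statusCode": 200, "body": json.dumps({"s3_key": key})}
```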
Data Storage and Management:
- Design and implement data storage solutions using Amazon S3, Redshift, DynamoDB, and RDS.
- Architect and manage the S3-to-Snowflake bridge using Snowpipe and External Tables to ensure near real-time data availability (see the sketch after this list).
- Optimize data structures, partitioning, and indexing for performance and cost efficiency.
- Ensure data security, integrity, and availability across different AWS storage solutions.
- Manage data lifecycle policies and archiving processes.
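For the S3-to-Snowflake bridge mentioned above, here is a minimal sketch of the Snowpipe side, executed through the snowflake-connector-python library. The stage, pipe, table, and integration names are hypothetical, credentials are placeholders, and a real setup would also create the storage integration and wire up S3 event notifications so AUTO_INGEST can fire:

```python
import snowflake.connector

# Hypothetical objects for continuously loading raw JSON from S3.
DDL = [
    """
    CREATE STAGE IF NOT EXISTS raw_events_stage
      URL = 's3://example-raw-landing-bucket/raw/'
      STORAGE_INTEGRATION = example_s3_integration
      FILE_FORMAT = (TYPE = 'JSON')
    """,
    """
    CREATE PIPE IF NOT EXISTS raw_events_pipe
      AUTO_INGEST = TRUE
    AS
      COPY INTO raw_events (payload)
      FROM @raw_events_stage
    """,
]

conn = snowflake.connector.connect(
    account="example_account",  # placeholder credentials
    user="example_user",
    password="example_password",
    warehouse="LOAD_WH",
    database="RAW",
    schema="PUBLIC",
)
try:
    cur = conn.cursor()
    for stmt in DDL:
        cur.execute(stmt)
finally:
    conn.close()
```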
Data Transformation and Processing:
- Develop data transformation processes using AWS Glue, EMR, and Athena.
- Implement data quality checks, validation rules, and monitoring solutions.
- Support real-time and batch data processing needs.
- Develop and manage complex data transformations using dbt and Snowflake SQL, ensuring code modularity and version control.
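Since dbt and Snowflake SQL are called out above, here is a minimal sketch of a dbt model in this stack, written as a dbt Python model (supported on Snowflake via Snowpark). The model and column names are hypothetical, and most models in practice would be plain Snowflake SQL files under the same version-control and testing workflow:

```python
# models/fct_completed_orders.py -- a minimal dbt Python model sketch.
# On Snowflake, dbt.ref() returns a Snowpark DataFrame.


def model(dbt, session):
    dbt.config(materialized="table")

    orders = dbt.ref("stg_orders")  # hypothetical upstream staging model

    # Keep only completed orders; a real model would also declare
    # tests and documentation for these columns in a schema.yml file.
    return orders.filter(orders["STATUS"] == "COMPLETED")
```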
Data Integration and Automation:
- Integrate data from multiple sources, including APIs, databases, and third-party applications.
- Ensure data consistency across different environments and systems.
Collaboration and Stakeholder Engagement:
- Work closely with data scientists and analysts to understand data needs and business goals.
- Provide technical guidance and best practices to the data engineering and business teams.
- Collaborate with security and compliance teams to ensure data governance standards are met.
Performance Monitoring and Troubleshooting:
- Monitor data pipeline performance and troubleshoot issues in real-time.
- Analyze data pipeline failures and implement fixes to prevent recurrence.
- Set up logging and monitoring using CloudWatch and X-Ray.
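As one concrete example of this monitoring setup, here is a minimal sketch that puts a CloudWatch alarm on a Lambda function's error count; the function name, SNS topic ARN, and thresholds are all hypothetical:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the (hypothetical) ingestion function reports any errors
# over a five-minute window; notifications go to a placeholder topic.
cloudwatch.put_metric_alarm(
    AlarmName="ingest-orders-errors",
    Namespace="AWS/Lambda",
    MetricName="Errors",
    Dimensions=[{"Name": "FunctionName", "Value": "ingest-orders"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:data-eng-alerts"],
)
```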
Qualifications:
- Bachelor’s degree in Computer Science, Data Engineering, or a related field; Master’s degree is a plus.
- 5+ years of experience in data engineering, with a focus on AWS and modern cloud data warehouses (Snowflake experience required).
- AWS Certified Data Analytics – Specialty or AWS Certified Big Data – Specialty certification is required.
- Strong proficiency with AWS services such as S3, Redshift, Glue, Athena, Kinesis, Lambda, and Step Functions.
- Hands-on experience with big data tools and frameworks such as Apache Spark, Hadoop, and Flink.
- Proficiency in dbt (Core or Cloud) for managing transformation workflows, with a strong understanding of dbt testing and documentation frameworks.
- Proficiency in programming languages such as Python, Java, or Scala.
- Strong knowledge of SQL, data modeling, and query optimization.
- Experience with CI/CD tools and version control (e.g., Git, CodePipeline).
- Strong understanding of data governance, security, and compliance requirements.
- Ability to manage large-scale data processing and real-time data pipelines.
- Excellent problem-solving, analytical, and communication skills.
Preferred Skills:
- Experience with data lake architecture and implementation on AWS.
- Familiarity with Infrastructure as Code (IaC) tools such as CloudFormation and Terraform.
- Experience with NoSQL databases (e.g., DynamoDB) and key-value stores.
- Knowledge of containerization and orchestration using ECS and EKS.
- SnowPro Core Certification is a significant plus.
- Experience with dbt package management and integrating dbt with orchestration tools on AWS, such as Apache Airflow (Amazon MWAA) or Step Functions.
What We Offer:
- Competitive salary and performance-based incentives.
- Comprehensive health, dental, and vision coverage.
- 401(k) with company matching.
- Professional development and training opportunities (including AWS certification).
- Flexible work environment and remote work options.
Join us and be part of a team building innovative and scalable data solutions on Amazon Web Services!
This position is open to multiple engagement models, including Permanent/Full-Time, Contract, or Corp-to-Corp (C2C) arrangements. We are looking for the best talent and are flexible on the employment structure for the right candidate.
We are not working with third-party agencies at this time; resumes submitted through agencies will be automatically rejected.