Senior Data Engineer

Senior Data Engineer: strongest on Glue + Spark + Iceberg + streaming/batch + Redshift

Duration: 14 weeks
Note: Consultants should be able to travel within the US to the Greenville site once a month.

Must-have: Strong Glue, ETL, and Spark skills; experience with Iceberg; ability to work with both batch and real-time or streaming ingestion; familiarity with Kinesis and/or MSK; and understanding of Redshift and data warehouse patterns.
Nice-to-have: Working knowledge of DynamoDB for metadata-related use cases, broader awareness of AWS streaming options, and prototype-building capability to support architecture validation.

Position Overview

We are seeking a Senior Data Engineer to design and build the data pipelines, data products, and integration flows for the GE Vernova MIDA engagement. This role involves hands-on pipeline architecture, data quality validation, and building the foundational data products that enable the Industrial Data Mesh.

Key Responsibilities

Technical Leadership

Design and architect batch and streaming data pipelines for industrial data
Define data product schemas, contracts, and quality validation rules
Implement data integration patterns (CDC, event-driven, pub/sub) across OT and IT systems
Design schema evolution strategies using Avro, Parquet, and Apache Iceberg
Optimize pipeline performance and cost efficiency

Customer Engagement

Collaborate with GE Vernova data teams to understand existing data flows
Support data domain workshops with technical pipeline feasibility input
Present pipeline design recommendations to customer engineering teams

Solution Development

Build production-ready data pipelines on AWS infrastructure
Implement data quality validation and enrichment at ingestion
Develop automated testing for data products
Create infrastructure-as-code for pipeline deployment

Qualifications

Experience

5-7 years in data engineering or ETL/ELT development
Experience with large-scale streaming and batch data processing
Experience in manufacturing or industrial data environments preferred

Technical Skills (AWS services; comparable alternatives will be considered)

AWS Glue (ETL, Spark, Iceberg)
Amazon Kinesis Data Streams / Firehose
Amazon MSK (Kafka)
Amazon Athena, Amazon Redshift
AWS Step Functions, AWS Lambda
Amazon S3 (partitioning, lifecycle management)
Amazon DynamoDB (metadata/state)
AWS CDK / CloudFormation
Programming: Python, PySpark, SQL

Soft Skills

Strong problem-solving and analytical skills
Ability to work collaboratively with architects and customer teams
Experience in agile environments

AWS Certifications (Nice to have)

AWS Certified Data Analytics – Specialty
AWS Certified Solutions Architect – Associate

Pay: $65.00 - $70.00 per hour

Experience:

Glue + Spark + Iceberg + streaming/batch + Redshift: 4 years (Required)
Data engineering or ETL/ELT development: 8 years (Required)

License/Certification:

AWS Certified Data Analytics – Specialty (Required)
AWS Certified Solutions Architect – Associate (Required)

Work Location: Remote
