Overview:
We are seeking an Associate Data Engineer to support our data engineering team in developing data processing and analytics solutions using modern big data technologies.
Responsibilities:
- Assist in developing data processing pipelines using Python and basic Spark operations
- Support ETL workflow development on AWS EMR clusters
- Learn and apply Spark optimization techniques under guidance
- Help maintain data exploration tools using Hue and other platforms
- Support data scientists and analysts with data pipeline requirements
- Participate in monitoring distributed computing environments
Qualifications:
- 1-3 years of software development experience
- Basic proficiency in Python and SQL
- Understanding of database concepts and ETL fundamentals
- Exposure to big data technologies and cloud platforms (AWS preferred)
- Basic knowledge of Apache Spark or willingness to learn quickly
- Familiarity with AWS services or cloud computing concepts
- Interest in data processing and analytics
- Strong problem-solving and learning abilities
- Academic or project experience with data streaming technologies
- Basic understanding of data security principles
- Exposure to machine learning concepts through coursework or projects
- Interest in pursuing AWS certifications
- Familiarity with version control systems (Git)
- Basic understanding of containerization concepts
- Exposure to agile development methodologies