Overview:
The Data Engineer designs, implements, and manages robust data architectures and infrastructure. The role is key to driving the organization's data strategy, ensuring the efficient flow and storage of data, and supporting data-driven decision-making.
Duties & Responsibilities:
- Develop and maintain scalable and efficient data models and schemas that support the storage, processing, and retrieval of data.
- Implement robust ELT (Extract, Load, Transform) processes to integrate data from various sources into a unified and consistent format.
- Ensure seamless data flow between different systems, databases, and applications.
- Implement data validation and cleansing processes to ensure the accuracy and reliability of data.
- Stay abreast of emerging technologies and trends in the field of data engineering.
- Manage and optimize database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
- Performance Optimization:
  - Identify and address performance bottlenecks in data processing and storage systems.
  - Optimize queries, indexing strategies, and data partitioning for improved performance.
- Scalability and Reliability:
  - Ensure that data systems are scalable to handle growing volumes of data.
  - Implement redundancy and fault-tolerance measures to enhance data system reliability.
Skills Required:
Mandatory Skills Required:
- Strong hands-on expertise with AWS services such as S3, Glue, Lambda, Athena, Redshift, and Step Functions, and with streaming tools such as Apache Kafka / Amazon Kinesis.
- Proficiency in Python and PySpark; experience with ELT and orchestration tools.
- Strong hands-on expertise with Apache Hudi / Apache Iceberg.
- Strong hands-on expertise in SQL.
- Experience using GitHub for version control and collaboration.
- Ability to develop, optimize, and maintain ELT processes for data integration.
- Strong communication skills to interface with technical and business stakeholders.
Secondary Skills Required:
- Knowledge of cloud infrastructure automation (e.g., Terraform, CloudFormation).
- Experience with DevOps CI/CD pipelines and source code management.
- AWS certification; experience with platform automation and monitoring frameworks.
Qualifications Required:
- Bachelor’s degree (B.A.) from a four-year college or university, or an equivalent combination of education and experience.
- 7+ years of experience in data engineering or a related role.
- Proficient in data modeling, ELT development, and database management.
- Experience with big data technologies and distributed computing frameworks.