Job Purpose
As a Data Engineer at SAAL.ai, you will be responsible for designing, building, and operating scalable and resilient data platforms that enable high-impact analytics, insights, and decision-making. You will work closely with product, analytics, and engineering teams to prototype, deliver, and operate reliable data pipelines and big-data services in production environments.
Key Responsibilities
- Define, design, and develop services for large-scale data ingestion, storage, and management across relational databases, NoSQL systems, log files, and event streams.
- Design, build, and operate scalable batch and streaming data pipelines in production environments.
- Ensure data platforms meet performance, reliability, scalability, and low-latency requirements.
- Develop highly concurrent systems for processing structured and unstructured datasets.
- Design and implement workflows and pipelines using tools such as Apache NiFi, Kafka, Spark, Flink, and Airflow.
- Support both batch and real-time data processing use cases.
- Collaborate with internal product teams and third-party service providers on system integrations and platform enhancements.
- Participate in sprint planning, development, and delivery activities.
- Ensure solutions are deployable, observable, and operationally manageable in live production environments.
Educational Qualification
Bachelor’s degree in Computer Science, Software Engineering, or a related field is preferred.
Experience
A minimum of 5 years of professional experience as a Data Engineer or Big Data Engineer.
Essential Skills
- Strong experience in designing and operating data pipelines using Apache NiFi, Kafka, Spark, Flink, and Airflow.
- Solid understanding of distributed data architectures, data modelling, OLAP design, and change data capture (CDC), including tools such as Debezium.
- Experience with large-scale storage and messaging platforms, including MinIO/S3 and Kafka.
- Proficiency in batch and real-time data processing, and familiarity with data formats such as Avro, Parquet, and ORC.
- Strong scripting or programming skills.
- Experience working in Agile environments and using version control systems such as Git.
- Experience delivering enterprise-scale analytics and reporting platforms, a strong understanding of infrastructure and container-based deployments, and the ability to advise on emerging data technologies and best practices.