🔹 Job Responsibilities
- Design, build & maintain scalable data pipelines for ingestion, processing & storage.
- Collaborate 🤝 with data scientists, analysts, and product teams to deliver high-quality data solutions.
- Optimize data systems for performance, reliability, scalability, and cost-efficiency.
- Implement data quality checks ✅ ensuring accuracy, completeness, and consistency.
- Work with structured & unstructured data 📊 from diverse sources.
- Develop & maintain data models, metadata, and documentation.
- Automate & monitor workflows using tools like Apache Airflow (or similar); see the sketch after this list.
- Ensure data governance & security best practices are followed.
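
To give candidates a concrete picture of the orchestration work above, here is a minimal sketch of an Airflow DAG that chains an ingestion step with a simple data quality check. It assumes a recent Apache Airflow 2.x install; the DAG id, tasks, and check logic are illustrative placeholders, not an actual pipeline from our stack.

```python
# Minimal illustrative Airflow DAG: ingest data, then run a quality check.
# All names, tables, and thresholds here are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_orders():
    # Placeholder: pull raw data from a source system into staging storage.
    print("ingesting orders into staging")


def check_orders_quality():
    # Placeholder: verify completeness/accuracy, e.g. no empty staging table.
    # Raising an exception fails the task and surfaces the issue in monitoring.
    row_count = 100  # in practice this would come from a warehouse query
    if row_count == 0:
        raise ValueError("quality check failed: staging table is empty")


with DAG(
    dag_id="orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_orders", python_callable=ingest_orders)
    quality_check = PythonOperator(
        task_id="check_orders_quality", python_callable=check_orders_quality
    )

    ingest >> quality_check  # run the quality check only after ingestion succeeds
```

In a real pipeline each task would call out to the warehouse or processing engine rather than run logic locally; the sketch only shows the shape of the orchestration and quality-gating work.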
🔹 Required Skills & Qualifications
- Bachelor’s/Master’s 🎓 degree in Computer Science, Engineering, or a related field.
- 3–5 years of experience in data engineering, ETL development, or backend data systems.
- Proficiency in 💻 SQL & Python/Scala.
- Experience with 🛠️ big data tools (Spark, Hadoop, Kafka, etc.).
- Hands-on experience with ☁️ cloud data platforms (AWS Redshift, GCP BigQuery, Azure Data Lake).
- Familiarity with orchestration tools (Airflow, Luigi, etc.).
- Experience with data warehousing & data modeling.
- Strong analytical & problem-solving skills; ability to work independently & in teams.
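
For orientation only, a short PySpark sketch of the kind of SQL-style transformation this skill set covers. It assumes PySpark is available; the bucket paths, column names, and filter condition are hypothetical placeholders.

```python
# Illustrative PySpark aggregation: read raw events, keep completed orders,
# and roll up daily spend per customer. Paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_rollup").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/raw/orders/")

daily_totals = (
    orders
    .filter(F.col("status") == "completed")
    .groupBy("customer_id", "order_date")
    .agg(F.sum("amount").alias("daily_amount"))
)

daily_totals.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_totals/")
```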
🔹 Preferred Qualifications
- 🐳 Experience with containerization (Docker, Kubernetes).
- 🔄 Knowledge of CI/CD processes & Git version control.
- 📜 Understanding of data privacy regulations (GDPR, CCPA, etc.).
- 🤖 Exposure to machine learning pipelines / MLOps is a plus.