We are looking for a skilled Data Engineer with strong expertise in PySpark and Data Modeling to join our Data & Analytics team. The ideal candidate will be responsible for building scalable data pipelines, optimizing data workflows, and supporting advanced analytics initiatives.
Key Responsibilities:
- Design, develop, and maintain scalable data pipelines using PySpark (see the sketch after this list)
- Perform conceptual, logical, and physical data modeling for analytics and reporting
- Build and optimize ETL/ELT workflows for large-scale datasets
- Work with structured and unstructured data across multiple sources
- Ensure data quality, integrity, and governance standards are met
- Collaborate with data analysts, data scientists, and business stakeholders
- Tune the performance of Spark jobs and data processing systems
- Support the deployment and monitoring of data solutions in production
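To give a concrete sense of the day-to-day work, here is a minimal sketch of the kind of PySpark batch pipeline this role builds: read raw data, cleanse it, aggregate it, and write a curated output. The paths, column names, and schema are hypothetical placeholders for illustration, not details of our actual stack.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a Spark session; in production this would run on a cluster.
spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Read raw order events (hypothetical bucket and layout).
orders = spark.read.json("s3://raw-bucket/orders/")

# Basic cleansing: drop rows missing the key, normalize types.
cleaned = (
    orders
    .dropna(subset=["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
)

# Aggregate to a daily revenue table for downstream reporting.
daily_revenue = (
    cleaned
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(
        F.sum("amount").alias("total_revenue"),
        F.count("order_id").alias("order_count"),
    )
)

# Write partitioned Parquet; overwrite keeps a daily rerun idempotent.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated-bucket/daily_revenue/"
)
```

In practice a job like this would be scheduled by an orchestrator and monitored in production, which is where the deployment and monitoring responsibilities above come in.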
Required Skills & Qualifications:
- Strong experience with PySpark and the Apache Spark ecosystem
- Hands-on experience with data modeling, including star and snowflake schemas (see the sketch after this list)
- Proficiency in SQL and database technologies
- Experience with data warehousing concepts
- Knowledge of ETL/ELT tools and frameworks
- Familiarity with cloud platforms (AWS, Azure, or GCP) is a plus
- Understanding of big data technologies (Hadoop, Hive, Kafka, etc.)
- Strong problem-solving and analytical skills
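As a rough illustration of the star-schema modeling mentioned above, the sketch below joins a small fact table to two dimension tables and aggregates by dimension attributes. All table and column names are invented for the example.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star-schema-demo").getOrCreate()

# Hypothetical star schema: one fact table, two dimension tables.
fact_sales = spark.createDataFrame(
    [(1, 101, 201, 250.0), (2, 102, 201, 99.5)],
    ["sale_id", "customer_key", "date_key", "amount"],
)
dim_customer = spark.createDataFrame(
    [(101, "Acme Corp", "EMEA"), (102, "Globex", "APAC")],
    ["customer_key", "customer_name", "region"],
)
dim_date = spark.createDataFrame(
    [(201, "2024-06-01", "Q2")],
    ["date_key", "calendar_date", "quarter"],
)

# A typical analytic query against the schema: revenue by region and quarter.
report = (
    fact_sales
    .join(dim_customer, "customer_key")
    .join(dim_date, "date_key")
    .groupBy("region", "quarter")
    .sum("amount")
)
report.show()
```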
Preferred Qualifications:
- Experience in the banking/financial services domain
- Exposure to data governance and data quality frameworks
- Knowledge of CI/CD pipelines and DevOps practices