Job Title: PySpark Tech Lead
Location: Remote, with occasional visits to Mumbai
Employment Type: Full-Time
Experience Level: 8+ years
About the Role
We are seeking an experienced PySpark Tech Lead to design, develop, and optimize large-scale data processing solutions using Apache Spark and Python. The ideal candidate will lead a team of data engineers, drive best practices in big data development, and collaborate with cross-functional teams to build scalable, high-performance data pipelines for analytics and business insights.
Key Responsibilities
- Lead the design, architecture, and implementation of end-to-end data processing and ETL pipelines using PySpark.
- Work closely with data architects, data scientists, and business stakeholders to translate requirements into technical solutions.
- Optimize Spark jobs for performance, scalability, and cost efficiency in distributed environments.
- Define and enforce coding standards, version control, and deployment best practices across the team.
- Mentor and guide junior engineers, conduct code reviews, and foster a culture of technical excellence.
- Collaborate with DevOps teams to manage data infrastructure, including cluster configuration, monitoring, and troubleshooting.
- Drive the adoption of modern data engineering tools and frameworks to improve productivity and reliability.
- Ensure data quality, governance, and compliance in all developed solutions.
Required Skills & Qualifications
- 8+ years of professional experience in data engineering or big data development.
- 3+ years of hands-on experience with PySpark, including Spark SQL, DataFrames, and RDDs.
- Strong programming skills in Python, with experience in building modular and testable code.
- Deep understanding of distributed computing concepts and Spark internals (partitions, shuffling, caching, etc.).
- Experience with data ingestion and integration from multiple sources (RDBMS, APIs, Kafka, etc.).
- Strong proficiency with SQL and experience working on data warehouses or data lakes (e.g., Delta Lake, Hive, Snowflake).
- Experience deploying Spark workloads on cloud platforms such as AWS EMR, Azure Databricks, or GCP Dataproc.
- Solid understanding of CI/CD pipelines, Git, and containerization (Docker/Kubernetes).
- Excellent problem-solving, communication, and leadership skills.
Nice to Have
- Experience with Airflow, NiFi, or other orchestration tools.
- Knowledge of Spark development in Scala or Java.
- Familiarity with data streaming frameworks (Kafka Streams, Spark Streaming, Flink).
- Exposure to machine learning pipelines or feature engineering workflows.
- Understanding of data governance, metadata management, and data catalog tools.
Education
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
Why Join Us
- Opportunity to lead complex data initiatives and shape the organization’s data ecosystem.
- Collaborative environment that values innovation, learning, and technical excellence.
- Work with cutting-edge big data and cloud technologies in large-scale production environments.
Pay: ₹812,640.62 – ₹2,099,692.05 per year
Work Location: Remote, with occasional visits to Mumbai