PySpark Developer

Job Title: PySpark Developer

Locations: Chennai, Hyderabad, Kolkata
Work Mode: 5 days work from office (WFO), Monday–Friday
Experience: 5+ years in Backend/Data Engineering
Notice Period: Immediate – 15 days
Must-Have: Python, PySpark, Amazon Redshift, PostgreSQL

About the Role

We are seeking an experienced PySpark Developer with strong data engineering expertise to design, develop, and optimize scalable data pipelines for large-scale data processing. The role involves working across distributed systems, ETL/ELT frameworks, cloud data platforms, and analytics-driven architecture. You will collaborate closely with cross-functional teams to ensure efficient ingestion, transformation, and delivery of high-quality data.

Key Responsibilities

  • Design and develop robust, scalable ETL/ELT pipelines using PySpark to process data from databases, APIs, logs, and file-based sources (a minimal sketch follows this list).
  • Convert raw data into analysis-ready datasets for data hubs and analytical data marts.
  • Build reusable, parameterized Spark jobs for batch and micro-batch processing.
  • Optimize PySpark performance to handle large and complex datasets.
  • Ensure data quality, consistency, and lineage, and maintain detailed documentation for all ingestion workflows.
  • Collaborate with Data Architects, Data Modelers, and Data Scientists to implement data ingestion logic aligned with business requirements.
  • Work with AWS services (S3, Glue, EMR, Redshift) for data ingestion, storage, and processing.
  • Support version control, CI/CD practices, and infrastructure-as-code workflows as needed.

Must-Have Skills

  • 5+ years of data engineering experience, with a strong focus on PySpark/Spark.
  • Proven experience building ingestion frameworks for relational, semi-structured (JSON, XML), and unstructured data (logs, PDFs).
  • Strong Python knowledge, including key data processing libraries.
  • Advanced SQL proficiency (Redshift, PostgreSQL, or similar).
  • Hands-on experience with distributed computing platforms (Spark on EMR, Databricks, etc.).
  • Familiarity with workflow orchestration tools (AWS Step Functions or similar).
  • Strong understanding of data lake and data warehouse architectures, including core data modeling concepts.

Good-to-Have Skills

  • Experience with AWS services: Glue, S3, Redshift, Lambda, CloudWatch, etc.
  • Exposure to Delta Lake or similar large-scale storage frameworks.
  • Experience with real-time streaming tools: Spark Structured Streaming, Kafka.
  • Understanding of data governance, lineage, and cataloging tools (Glue Catalog, Apache Atlas).
  • Knowledge of DevOps and CI/CD pipelines (Git, Jenkins, etc.).

Job Type: Full-time

Pay: ₹1,400,000.00 – ₹1,800,000.00 per year

Application Question(s):

  • How many years of experience do you have as a PySpark Developer?
  • Have you worked with Python, Amazon Redshift, and PostgreSQL?
  • What is your current location?
  • What are your notice period (NP), current CTC, and expected CTC (ECTC)?

Work Location: In person
