
PySpark Developer


Job Title: PySpark Developer

Location: Chennai, Hyderabad, Kolkata

Work Mode: Work from office (WFO), Monday to Friday (5 days)

Experience: 5+ years in backend development

Notice Period: Immediate to 15 days

Must-Have Experience: Python, PySpark, Amazon Redshift, PostgreSQL

About the Role:

We are looking for an experienced PySpark Developer with strong data engineering capabilities to design, develop, and optimize scalable data pipelines for large-scale data processing. The ideal candidate must possess in-depth knowledge of PySpark, SQL, and cloud-based data ecosystems, along with strong problem-solving skills and the ability to work with cross-functional teams.

Roles & Responsibilities:

- Design and develop robust, scalable ETL/ELT pipelines using PySpark to process data from various sources such as databases, APIs, logs, and files.

- Transform raw data into analysis-ready datasets for data hubs and analytical data marts.

- Build reusable, parameterized Spark jobs for batch and micro-batch processing (a minimal sketch follows this list).

- Optimize PySpark job performance to handle large and complex datasets efficiently.

- Ensure data quality, consistency, and lineage, and maintain thorough documentation across all ingestion workflows.

- Collaborate with Data Architects, Data Modelers, and Data Scientists to implement ingestion logic aligned with business requirements.

- Work with AWS-based data platforms (S3, Glue, EMR, Redshift) for data movement and storage.

- Support version control, CI/CD processes, and infrastructure-as-code practices as required.
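
To give a flavor of the work described above, here is a minimal sketch of a reusable, parameterized PySpark batch job. It is an illustration rather than a prescribed implementation: the app name, S3 paths, and column names (order_id, order_ts, ingest_date) are hypothetical placeholders.

    # Minimal sketch of a reusable, parameterized PySpark batch ETL job.
    # All paths, table names, and column names are hypothetical.
    import argparse

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F


    def build_parser() -> argparse.ArgumentParser:
        parser = argparse.ArgumentParser(description="Parameterized batch ETL job")
        parser.add_argument("--source-path", required=True,
                            help="Input path, e.g. s3://bucket/raw/orders/")
        parser.add_argument("--target-path", required=True,
                            help="Output path, e.g. s3://bucket/curated/orders/")
        parser.add_argument("--run-date", required=True,
                            help="Partition date to process, YYYY-MM-DD")
        return parser


    def transform(df):
        # Example cleanup: drop duplicate records and derive a date
        # partition column from an event timestamp.
        return (
            df.dropDuplicates(["order_id"])
              .withColumn("order_date", F.to_date("order_ts"))
        )


    def main() -> None:
        args = build_parser().parse_args()
        spark = SparkSession.builder.appName("orders_batch_etl").getOrCreate()

        # Read only the requested ingestion date from the raw source.
        raw = (
            spark.read.format("json")
            .load(args.source_path)
            .where(F.col("ingest_date") == args.run_date)
        )

        curated = transform(raw)

        # Write analysis-ready output, partitioned for downstream consumers.
        (
            curated.write.mode("overwrite")
            .partitionBy("order_date")
            .parquet(args.target_path)
        )

        spark.stop()


    if __name__ == "__main__":
        main()

A job in this shape can be submitted via spark-submit with different --source-path and --run-date values per run, which is what makes it reusable across datasets and schedules.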

Must-Have Skills:

- Minimum 5+ years of data engineering experience, with a strong focus on PySpark/Spark.

- Proven experience building data pipelines and ingestion frameworks for relational, semi-structured (JSON, XML), and unstructured data (logs, PDFs).

- Strong knowledge of Python and related data processing libraries.

- Advanced SQL proficiency (Amazon Redshift, PostgreSQL or similar).

- Hands-on expertise with distributed computing frameworks such as Spark on EMR or Databricks.

- Familiarity with workflow orchestration tools like AWS Step Functions or similar.

- Good understanding of data lake and data warehouse architectures, including fundamental data modeling concepts.

Good-to-Have Skills:

- Experience with AWS data services: Glue, S3, Redshift, Lambda, CloudWatch.

- Exposure to Delta Lake or similar large-scale storage technologies.

- Experience with real-time streaming tools such as Spark Structured Streaming or Kafka (see the sketch after this list).

- Understanding of data governance, lineage, and cataloging tools (AWS Glue Catalog, Apache Atlas).

- Knowledge of DevOps/CI-CD pipelines using Git, Jenkins.
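
As an illustration of the streaming exposure mentioned above, the following is a minimal Spark Structured Streaming sketch. The broker address, topic name, and S3 paths are hypothetical, and running it against Kafka additionally requires the spark-sql-kafka connector package on the classpath.

    # Minimal Spark Structured Streaming sketch; broker, topic, and
    # paths are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_stream").getOrCreate()

    # Read a stream of events from a Kafka topic.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "orders")
        .load()
    )

    # Kafka delivers key/value as binary; cast the value to a string payload.
    payloads = events.select(F.col("value").cast("string").alias("payload"))

    # Write each micro-batch to Parquet, with checkpointing for fault tolerance.
    query = (
        payloads.writeStream.format("parquet")
        .option("path", "s3://bucket/streams/orders/")
        .option("checkpointLocation", "s3://bucket/checkpoints/orders/")
        .start()
    )

    query.awaitTermination()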

Job Type: Full-time

Pay: ₹1,500,000.00 - ₹2,000,000.00 per year

Application Question(s):

  • We are hiring for this position immediately. Are you available to join within 30 days? If not, please mention your official notice period or your last working day.
  • How many years of experience do you have with PySpark?
  • How many years of experience do you have with Amazon Redshift?
  • How many years of hands-on experience do you have with ETL/ELT pipeline development?
  • What is your current location?
  • Are you comfortable working from office (WFO) Monday–Friday in Chennai/Hyderabad/Kolkata?
  • What is your current CTC? What is your expected CTC? Do you have any offers in hand?

Work Location: In person
