Qureos

AWS PySpark Lead

Locations: Bangalore, Karnataka; Hyderabad, Telangana; Noida, Uttar Pradesh; Pune, Maharashtra; Indore, Madhya Pradesh (India)


Qualification:

Location: All locations in India
Experience: 8–12 Years
Employment Type: Full-Time
Department: Data Engineering / Cloud Data Solutions

Job Summary:

We are looking for a highly skilled and motivated Data Engineer with 8–12 years of hands-on experience to join our growing data engineering team. The ideal candidate will have a strong grounding in AWS cloud services, Python programming, and big data technologies such as Spark and SQL. You will play a key role in designing, building, and optimizing scalable data pipelines and analytics solutions that support business insights and decision-making.

Key Responsibilities:

  • Design, develop, and maintain robust and scalable data pipelines using AWS services such as Glue, Lambda, Kinesis, Step Functions, Athena, and DynamoDB.
  • Write efficient and reusable Python scripts for data transformation, automation, and orchestration.
  • Work with Spark to process large datasets and optimize data workflows.
  • Develop complex SQL queries for data extraction, validation, and reporting purposes.
  • Collaborate with data scientists, analysts, and cross-functional teams to understand data requirements and deliver end-to-end solutions.
  • Ensure best practices around IAM, S3 data management, and secure data exchange using SNS/SQS.
  • Monitor pipeline performance and troubleshoot data issues to ensure high availability and reliability.
  • Document technical solutions, data flows, and architectural decisions.

Required Skills & Qualifications:

  • 8–12 years of experience in data engineering or a related field.
  • Strong hands-on expertise with AWS services, particularly:
    • Glue, Lambda, Kinesis, Step Functions, S3, DynamoDB, Athena, IAM, SNS, SQS.
  • Proficient in Python for scripting and automation.
  • Experience with Apache Spark for big data processing.
  • Strong knowledge of SQL and working with relational and non-relational databases.
  • Solid understanding of data architecture, data integration, and ETL best practices.
  • Ability to work in a fast-paced, collaborative environment and deliver high-quality solutions.

Preferred Qualifications:

  • AWS Certification (e.g., AWS Certified Data Analytics or Solutions Architect) is a plus.
  • Experience with CI/CD pipelines, infrastructure-as-code (Terraform/CloudFormation), or monitoring tools is an advantage.
  • Familiarity with data lake and real-time streaming architectures.

Experience: 8 to 12 years

Job Reference Number: 13520

Skills Required: AWS, PySpark, Spark


© 2026 Qureos. All rights reserved.