Data Engineer
We are seeking a skilled Data Engineer to design, build, and optimize scalable data pipelines and cloud-based data platforms. The role involves working with large-scale batch and real-time data processing systems, collaborating with cross-functional teams, and ensuring data reliability, security, and performance across the data lifecycle.
Key Responsibilities
Design, develop, and maintain complex end-to-end ETL pipelines for large-scale data ingestion and processing.
Optimize data pipelines for performance, scalability, fault tolerance, and reliability.
Develop batch and real-time data processing solutions using Apache Spark (PySpark/Scala) and Apache Kafka; a minimal streaming sketch follows this list.
Build and manage scalable, cloud-native data infrastructure on AWS.
Design resilient and cost-efficient data pipelines adaptable to varying data volume and formats.
Enable seamless ingestion and processing of batch and real-time streaming sources, such as Kafka topics on AWS MSK.
Ensure consistency, data quality, and a unified view across multiple data sources and formats.
Partner with business teams and data scientists to understand data requirements.
Perform in-depth data analysis to identify trends, patterns, and anomalies.
Deliver high-quality datasets and present actionable insights to stakeholders.
Implement and maintain CI/CD pipelines using Jenkins or similar tools.
Automate testing, deployment, and monitoring to ensure smooth production releases.
Collaborate with security teams to ensure compliance with organizational and regulatory standards (e.g., GDPR, HIPAA).
Implement data governance practices ensuring data integrity, security, and traceability.
Identify and resolve performance bottlenecks in data pipelines.
Apply best practices for monitoring, tuning, and optimizing data ingestion and storage.
Work closely with engineers, data scientists, product managers, and business stakeholders.
Participate in agile ceremonies, sprint planning, and architectural discussions.
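To ground the Spark-and-Kafka responsibilities above, here is a minimal PySpark Structured Streaming sketch: it reads JSON events from a Kafka topic and lands them as Parquet, with checkpointing for recovery. The topic name, broker address, event schema, and S3 paths are illustrative assumptions, not details from this posting, and the job would need the spark-sql-kafka connector package on its classpath (e.g. via spark-submit --packages).

```python
# Minimal streaming-ingestion sketch (illustrative names throughout).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Assumed payload schema for a hypothetical "events" topic.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # e.g. an MSK bootstrap string
    .option("subscribe", "events")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers raw bytes; decode the value and parse it into typed columns.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Checkpointing lets the query resume after a failure without reprocessing gaps.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/events/")                  # illustrative sink
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```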
Skills & Qualifications
AWS Expertise
Hands-on experience with AWS big data services such as EMR, Amazon MWAA (Managed Workflows for Apache Airflow), Glue, S3, DMS, MSK, and EC2.
Strong understanding of cloud-native data architectures.
Big Data Technologies
Proficiency in Spark (PySpark or Scala) and SQL for large-scale data transformation and analysis.
Experience with Apache Spark and Apache Kafka in production environments.
Data Frameworks
Strong knowledge of Spark DataFrames and Datasets.
ETL Pipeline Development
Proven experience in building scalable and reliable ETL pipelines for both batch and real-time data processing.
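As a sketch of the batch side, a typical job of this kind reads raw JSON from S3, applies cleanup, and writes date-partitioned Parquet. The bucket layout, column names, and quality rules below are assumptions for the example, not requirements from this posting.

```python
# Minimal batch ETL sketch: raw JSON in, partitioned Parquet out.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

orders = spark.read.json("s3a://example-raw/orders/")  # assumed input layout

cleaned = (
    orders
    .dropDuplicates(["order_id"])                  # tolerate replayed inputs
    .filter(col("amount") > 0)                     # basic quality gate
    .withColumn("order_date", to_date("created_at"))
)

(
    cleaned.write
    .mode("overwrite")                             # full overwrite keeps re-runs idempotent
    .partitionBy("order_date")                     # lets downstream scans prune by date
    .parquet("s3a://example-curated/orders/")
)
```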
Database Modeling & Data Warehousing
Expertise in designing scalable data models for OLAP and OLTP systems.
Data Analysis & Insights
Ability to perform complex data analysis and extract actionable business insights.
Strong analytical and problem-solving skills with a data-driven mindset.
CI/CD & Automation
Basic to intermediate experience with CI/CD pipelines using Jenkins or similar tools.
Familiarity with automated testing and deployment workflows.
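One way such an automated-testing stage can look: a unit test that runs transformation logic on a local Spark session before anything is deployed. Everything here is hypothetical, and clean_orders is a stand-in transform rather than a function from this role, but the pattern (pytest plus a local[1] Spark session) is a common one for a Jenkins stage to invoke.

```python
# Hypothetical CI test for a small Spark transform.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql.functions import col


def clean_orders(df):
    """Assumed transform: drop duplicate orders and non-positive amounts."""
    return df.dropDuplicates(["order_id"]).filter(col("amount") > 0)


@pytest.fixture(scope="session")
def spark():
    # Single-threaded local session: fast enough for CI, no cluster needed.
    return SparkSession.builder.master("local[1]").appName("ci-tests").getOrCreate()


def test_clean_orders_removes_bad_rows(spark):
    rows = [("a", 10.0), ("a", 10.0), ("b", -5.0)]
    df = spark.createDataFrame(rows, ["order_id", "amount"])
    out = clean_orders(df)
    assert out.count() == 1                  # duplicate and negative rows removed
    assert out.first()["order_id"] == "a"
```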
Additional Skills
Knowledge of Java for data processing applications.
Experience with NoSQL databases (e.g., DynamoDB, Cassandra, MongoDB).
Familiarity with data governance frameworks and compliance tooling.
Experience with monitoring and observability tools such as AWS CloudWatch, Splunk, or Dynatrace; a small CloudWatch sketch follows this list.
Exposure to cost optimization strategies for large-scale cloud data platforms.
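For the monitoring point above, one common pattern is to publish pipeline health metrics to CloudWatch from job code via boto3. The region, namespace, metric, and dimension names below are illustrative assumptions.

```python
# Publish a custom pipeline metric to CloudWatch (illustrative names).
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

cloudwatch.put_metric_data(
    Namespace="DataPipelines",                     # hypothetical namespace
    MetricData=[{
        "MetricName": "RecordsProcessed",
        "Dimensions": [{"Name": "Pipeline", "Value": "orders_etl"}],
        "Value": 12345,                            # e.g. the job's output row count
        "Unit": "Count",
    }],
)
```

An alarm on this metric flatlining is a cheap way to catch a pipeline that is failing silently.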