Job Requirements
Hires in: Not specified
Employment type: Not specified
Company location: Not specified
Salary: Not specified
We are looking for a skilled Data Engineer with hands-on experience building, optimizing, and maintaining large-scale data pipelines on the Cloudera Hadoop platform. The ideal candidate has strong expertise in distributed systems, ETL development, data ingestion, and cluster operations.
Responsibilities
Design, build, and maintain scalable data pipelines using Hadoop ecosystem tools (HDFS, Hive, Impala, Spark, Kafka, Sqoop, Oozie).
Manage and optimize Cloudera clusters, including configuration, upgrades, performance tuning, and security.
Develop and optimize Spark (PySpark/Scala) applications for batch and real-time processing.
Configure and maintain Cloudera Manager, YARN, and Hive Metastore.
Build data ingestion processes from various sources such as RDBMS, APIs, streaming systems, and cloud storage.
Work closely with Data Architects to implement data models, governance, quality, and lineage.
Implement data security best practices using Kerberos, Apache Ranger, and Apache Sentry.
Monitor, troubleshoot, and resolve production issues, ensuring high availability.
Collaborate with analytics, BI, and product teams for data delivery and automation.
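To make the batch-processing responsibility above concrete, here is a minimal sketch of the map/reduce-style aggregation a Spark batch job performs at cluster scale. It is written in plain Python so it runs anywhere; the event records and field names are illustrative assumptions, not part of the posting, and in real PySpark this would be a `groupBy().count()` on a DataFrame or `reduceByKey` on an RDD.

```python
from collections import defaultdict

def batch_aggregate(records):
    """Group-and-count aggregation, the core pattern of a batch job.

    Emulates in plain Python what a PySpark job distributes across
    executors: extract a key per record ("map"), then combine counts
    per key ("reduce").
    """
    counts = defaultdict(int)
    for record in records:            # map phase: pull out the key
        counts[record["event"]] += 1  # reduce phase: combine per key
    return dict(counts)

# Illustrative input resembling ingested clickstream rows
events = [{"event": "click"}, {"event": "view"}, {"event": "click"}]
print(batch_aggregate(events))  # {'click': 2, 'view': 1}
```

The same key-then-combine shape carries over whether the sink is HDFS, Hive, or a Kafka topic; only the framework plumbing changes.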
Requirements
Strong experience with Cloudera CDH/CDP distributions.
Hands-on expertise with Hadoop ecosystem tools: HDFS, Hive, Impala, Spark, Kafka, Sqoop, Oozie, ZooKeeper.
Solid programming skills in Python, Scala, or Java.
Strong SQL experience and ability to optimize complex queries.
Experience with ETL pipelines, data warehousing concepts, and data lake architectures.
Knowledge of Linux, shell scripting, and DevOps tools (Git, Jenkins).
Experience working with cloud environments (AWS/GCP/Azure) is a plus.
Good understanding of distributed computing and big data performance optimization.
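As a small illustration of the SQL-optimization skill listed above: a common tuning step is giving the engine index (or, in Hive/Impala, partition and statistics) support for a selective filter. The sketch below uses Python's built-in `sqlite3` as a stand-in engine, since the exact mechanisms differ per platform; the table, column names, and values are invented for the example.

```python
import sqlite3

# SQLite stands in for Hive/Impala here; the idea -- avoid a full
# scan on a selective filter -- carries over, though big-data engines
# use partitioning and table statistics rather than B-tree indexes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("emea", 10.0), ("apac", 5.0), ("emea", 7.5)],
)

# Inspect the plan before optimizing: the filter forces a table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(amount) FROM sales WHERE region = 'emea'"
).fetchall()

# Add an index so the filter can seek instead of scanning every row.
conn.execute("CREATE INDEX idx_sales_region ON sales(region)")

total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'emea'"
).fetchone()[0]
print(total)  # 17.5
```

Reading the plan first, then changing the physical layout, is the same workflow a Data Engineer applies with `EXPLAIN` in Hive or Impala.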
© 2025 Qureos. All rights reserved.