About the Role
We are seeking a Cloudera Developer to design, build, and optimize data pipelines and processing jobs on our Cloudera platform. The role involves leveraging NiFi, Spark, Hive, and Impala to deliver high-quality, reliable data flows. Experience with enterprise data integration platforms (e.g., TIBCO) is a plus but not mandatory.
Key Responsibilities
- Develop, test, and maintain data ingestion flows using Apache NiFi.
- Build large-scale data transformation jobs with Apache Spark (Scala or PySpark).
- Implement API-based integrations and data exchanges with enterprise platforms.
- Ensure data security, quality, and compliance using governance tools (Ranger, Atlas, SDX).
- Monitor, troubleshoot, and optimize data pipelines for performance and reliability.
- Contribute to automation and CI/CD processes for data pipelines.
- Document data flows, schemas, and transformations for ongoing support.
Required Skills & Experience
- 4–5 years of experience in data engineering/development on Cloudera/CDP platforms.
- Hands-on expertise with Apache NiFi, Spark (Scala or PySpark), Hive, and Impala.
- Knowledge of data governance and security (Ranger, Atlas, SDX).
- Experience integrating with external platforms via REST APIs or message queues.
- Familiarity with Git and CI/CD pipelines.
- Strong ability to troubleshoot data flow, cluster, and performance issues.
Preferred Qualifications
- Experience with TIBCO Data Hub/Data Science or other enterprise data integration platforms.
- Cloudera certifications (CCA or Cloudera Data Platform).
- Exposure to cloud-native or hybrid data platforms.
- Knowledge of containerization (Docker/Kubernetes) for data workloads.