Key Responsibilities
- Design, develop, and support our Big Data platforms, including Kafka, Hadoop, Dremio, and related technologies.
- Build, deploy, and oversee robust data processing pipelines using Java, Python, Spark, and Flink.
- Collaborate with development teams on data modeling, ingestion approaches, and capacity planning.
- Work closely with users to maintain data accuracy, consistency, and availability across systems.
- Act as a Big Data subject-matter expert, providing guidance on technical questions and best practices.
Required Skills & Experience
- 5+ years of experience in a mature data engineering environment.
- 3+ years of hands-on experience developing Kafka streaming applications or managing Kafka clusters.
- 2+ years building applications or pipelines on Big Data architectures (S3, HDFS, Databricks, Iceberg, etc.).
- Proficiency with Apache Spark, Apache Flink, or comparable data processing tools.
- Strong programming capabilities in Java, Python, and SQL.
- Experience with widely used Python-based data science libraries (e.g., pandas, NumPy, scikit-learn).
- Practical knowledge of Kubernetes and Docker.
- Familiarity with monitoring tools such as Prometheus/Grafana, AlertManager, Alerta, and OpsGenie.
- Strong background in statistical analysis.
- Demonstrated ability to troubleshoot and perform root-cause analysis.
- Experience writing scripts in Unix environments (bash, Python, etc.).