Overview:
We are seeking a talented and motivated GCP Data Engineer with experience in real-time streaming technologies, specifically Apache Kafka, and proficiency in Java. The ideal candidate will be responsible for designing, developing, and maintaining data pipelines and streaming applications in a cloud environment.
Responsibilities:
- Data Pipeline Development: Design and implement robust data pipelines using Google Cloud services (BigQuery, Dataflow, Pub/Sub) integrated with Kafka for real-time data processing.
- Streaming Applications: Develop and maintain real-time streaming applications using Kafka and Java, ensuring high availability and performance.
- Cloud Infrastructure: Utilize GCP technologies to build scalable and efficient data architectures, leveraging tools such as Cloud Storage, Cloud Functions, and Kubernetes.
- Monitoring and Optimization: Monitor data flow and performance metrics, troubleshoot issues in production environments, and optimize data pipeline performance and reliability.
- Collaboration: Work closely with data scientists, analysts, and other engineers to understand data requirements and translate them into technical specifications.
- Documentation: Maintain comprehensive documentation of data pipelines, architecture, and processes to ensure knowledge sharing and compliance.
Requirements:
- Cloud Skills: Proficiency in Google Cloud Platform (GCP), including services like BigQuery, Dataflow, and Pub/Sub.
- Streaming Technologies: Strong experience with Kafka for real-time data streaming and messaging.
- Programming Language: Proficient in Java, with a solid understanding of object-oriented programming principles and best practices.
- Data Management: Familiarity with data modeling, ETL processes, and database management.
- Development Tools: Experience with version control systems (e.g., Git), CI/CD tools, and Agile methodologies.
- Problem-Solving: Strong analytical and troubleshooting skills, with the ability to work independently and as part of a team.
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Experience with other programming languages (Python, Scala) is a plus.
- Knowledge of additional GCP services or data visualization tools (e.g., Looker, Tableau) is advantageous.