We are looking for an experienced Java + Apache Spark Developer with 6+ years of hands-on experience in building scalable, high-performance data processing applications. The ideal candidate should have strong expertise in Java, distributed data processing, and big data ecosystems.
Key Responsibilities
- Design, develop, and maintain scalable data processing applications using Java and Apache Spark
- Develop batch and real-time data pipelines
- Optimize Spark jobs for performance, scalability, and reliability
- Work with large datasets using distributed computing frameworks
- Integrate Spark applications with Hadoop ecosystem tools (HDFS, Hive, etc.)
- Collaborate with data engineers, analysts, and cross-functional teams
- Troubleshoot production issues and ensure high availability
- Follow best practices in coding, testing, and deployment
Required Skills & Qualifications
- 5+ years of experience in Java development
- Strong hands-on experience with Apache Spark (Core, SQL, DataFrames, Datasets)
- Good understanding of the Hadoop ecosystem (HDFS, Hive, YARN)
- Experience in writing complex SQL queries
- Knowledge of distributed systems and parallel processing
- Experience with REST APIs and microservices architecture
- Familiarity with version control systems (Git)
- Strong debugging and performance-tuning skills
Preferred Skills
- Experience with Spark Streaming / Structured Streaming
- Knowledge of Kafka for real-time data processing
- Cloud experience (AWS / Azure / GCP)
- Experience with Docker/Kubernetes
- CI/CD pipeline exposure
- Understanding of data warehousing concepts
Education
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
Soft Skills
- Strong problem-solving skills
- Good communication and stakeholder interaction
- Ability to work in an Agile/Scrum environment