Note: Please apply only if you're available for a walk-in interview on 22nd November at our Bangalore office (Manyata Tech Park).
Job Title: Data Engineer (PySpark + Databricks)
Location: [Insert Location]
Employment Type: Full-time
Experience Level: 4-12 years
Role Overview
We are seeking a skilled Data Engineer with strong experience in PySpark and Databricks to design, develop, and optimize large-scale data pipelines and solutions. The ideal candidate will work closely with data architects, analysts, and business stakeholders to ensure efficient data processing and integration across platforms.
Key Responsibilities
- Design and implement scalable ETL pipelines using PySpark on Databricks.
- Develop and maintain data workflows for structured and unstructured data.
- Optimize Spark jobs for performance and cost efficiency.
- Collaborate with cross-functional teams to integrate data from multiple sources.
- Ensure data quality, security, and compliance with organizational standards.
- Work with cloud platforms (Azure/AWS/GCP) for data storage and processing.
- Troubleshoot and resolve issues in data pipelines and workflows.
Required Skills
- Strong proficiency in PySpark and Databricks.
- Hands-on experience with Spark SQL, Delta Lake, and data lake architectures.
- Knowledge of cloud services (Azure Data Lake, AWS S3, or GCP equivalent).
- Familiarity with CI/CD pipelines and version control (Git).
- Experience with performance tuning in Spark environments.
- Good understanding of data modeling and data warehousing concepts.
Preferred Skills
- Experience with Airflow, Azure Data Factory, or similar orchestration tools.
- Knowledge of Python, SQL, and REST APIs.
- Exposure to machine learning workflows on Databricks is a plus.
Education
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.