About Persistent
We are a trusted Digital Engineering and Enterprise Modernization partner, combining deep technical expertise and industry experience to help our clients anticipate what’s next. Our offerings and proven solutions create unique competitive advantage for our clients by giving them the power to see beyond and rise above.
We are experiencing tremendous growth, with $566 million in revenue in FY21, representing 12.9% year-over-year growth. Along with that growth, we onboarded over 3,000 new employees in the past year, bringing our total employee count to over 15,000 people located in 18 countries across the globe.
At Persistent, our values are more than a list of ideals to improve our corporate image. We’re dedicated to building an inclusive culture that reflects what’s important to our employees and is based on what they value. As a result, 95% of our employees approve of the CEO and 83% recommend working at Persistent to a friend.
About the Position:
We are looking for an experienced Databricks Data Engineer with strong DevOps expertise to join our data engineering team. The ideal candidate will design, build, and optimize large-scale pipelines on the Databricks Lakehouse Platform on AWS, while driving automated CI/CD and deployment practices. This role requires strong skills in PySpark, SQL, AWS cloud services, and modern DevOps tooling. You will collaborate closely with cross-functional teams to deliver scalable, secure, and high-performance data solutions.
Role: Databricks Data Engineer with DevOps Skills
Location: Los Angeles CA (Hybrid)
Hire Type: Full-time / Contract
Experience: 8+ years
Expertise You'll Bring:
- Experience designing and implementing Databricks-based Lakehouse architectures on AWS
- Clear separation of compute and serving layers
- Ability to design low-latency data/API access strategies (beyond Spark-only patterns)
- Strong understanding of caching strategies for performance and cost optimization
- Data partitioning, storage optimization, and file layout strategy
- Ability to handle multi-terabyte structured or time-series datasets
- Skill in requirements probing and identifying what matters architecturally
- A player-coach mindset: hands-on engineering plus technical leadership
- Strong hands-on experience with Databricks, including Delta Lake, Unity Catalog, Lakehouse architecture, Delta Live Tables (DLT) pipelines, Databricks Runtime, table triggers, and Databricks Workflows (see the DLT sketch after this list)
- Proficiency in PySpark, Spark, and advanced SQL
- Expertise with AWS cloud services, including S3, IAM, Glue / Glue Catalog, Lambda, Kinesis (optional but beneficial), and Secrets Manager
- Strong understanding of DevOps tools: Git / GitLab and CI/CD pipelines
- Experience with Databricks Asset Bundles
- Familiarity with Terraform is a plus
- Experience with relational databases and data warehouse concepts
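To ground the Delta Live Tables item above, here is a minimal pipeline sketch. It only runs inside a Databricks DLT pipeline (where the `dlt` module and the `spark` session are provided by the runtime), and the source path, table names, and quality rule are illustrative assumptions rather than details from this posting.

```python
# Minimal Delta Live Tables sketch; runs only inside a Databricks DLT pipeline,
# where `dlt` and `spark` are provided. Paths and names are hypothetical.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested incrementally from S3 via Auto Loader.")
def events_bronze():
    return (
        spark.readStream.format("cloudFiles")        # Auto Loader
        .option("cloudFiles.format", "json")
        .load("s3://example-bucket/raw/events/")     # hypothetical source path
    )

@dlt.table(comment="Cleaned events with a basic quality gate.")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def events_silver():
    return (
        dlt.read_stream("events_bronze")
        .withColumn("ingested_at", F.current_timestamp())
    )
```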
What You'll Do:
1. Data Engineering & Pipelines
- Design, build, and maintain scalable ETL/ELT pipelines using Databricks on AWS.
- Develop high-performance data processing workflows using PySpark/Spark and SQL.
- Integrate data from Amazon S3, relational databases, and semi-structured or unstructured sources.
- Implement Delta Lake best practices, including schema evolution, ACID transactions, OPTIMIZE, ZORDER, partitioning, and file-size tuning (see the sketch after this list).
- Ensure architectures support high-volume, multi-terabyte workloads.
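As one illustration of those Delta Lake practices (a sketch, not code from the team), the snippet below upserts a batch into a Delta table with schema evolution enabled, then compacts and Z-orders it. The table, column, and path names are hypothetical.

```python
# Illustrative Delta Lake upsert and maintenance; table, column, and path
# names are hypothetical, not taken from this posting.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Let MERGE add columns that appear in the source but not yet in the target
# (schema evolution).
spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

updates = spark.read.parquet("s3://example-bucket/staging/orders/")

target = DeltaTable.forName(spark, "main.sales.orders")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")  # ACID upsert
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

# Compact small files and co-locate rows for selective reads on customer_id.
spark.sql("OPTIMIZE main.sales.orders ZORDER BY (customer_id)")
```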
2. DevOps & CI/CD
- Implement CI/CD pipelines for Databricks using Git, GitLab, GitHub Actions, or AWS-native tools.
- Build and manage automated deployments using Databricks Asset Bundles (see the sketch after this list).
- Manage version control for notebooks, workflows, libraries, and environment configuration.
- Automate cluster policies, job creation, environment provisioning, and configuration management.
- Support infrastructure-as-code via Terraform (preferred) or CloudFormation.
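A minimal sketch of the deployment step a CI job might run for a Databricks Asset Bundle. It assumes the Databricks CLI is installed and already authenticated (for example via DATABRICKS_HOST and DATABRICKS_TOKEN environment variables) and that a databricks.yml bundle definition with a `prod` target exists; none of these specifics come from the posting.

```python
# CI deploy step for a Databricks Asset Bundle, assuming an authenticated
# Databricks CLI and a databricks.yml in the working directory.
import subprocess
import sys

def deploy_bundle(target: str = "prod") -> None:
    # Validate the bundle definition before deploying anything.
    subprocess.run(["databricks", "bundle", "validate", "-t", target], check=True)
    # Deploy the jobs, pipelines, and other resources defined in the bundle.
    subprocess.run(["databricks", "bundle", "deploy", "-t", target], check=True)

if __name__ == "__main__":
    deploy_bundle(sys.argv[1] if len(sys.argv) > 1 else "prod")
```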
3. Collaboration & Business Support
- Work with data analysts and BI teams to prepare curated datasets for reporting and analytics.
- Collaborate closely with product owners, engineering teams, and business partners to translate requirements into scalable implementations.
- Document data flows, technical architecture, and DevOps/deployment workflows.
4. Performance & Optimization
- Tune Spark clusters, workflows, and queries for cost efficiency and compute performance.
- Monitor pipelines, troubleshoot failures, and maintain high reliability.
- Implement logging, monitoring, and observability across workflows and jobs.
- Apply caching strategies and workload optimization techniques to support low-latency consumption patterns (see the sketch below).
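To illustrate the caching bullet above, a small PySpark sketch that caches a hot dimension table and broadcasts it into a join, avoiding a shuffle on the large fact table. The table and column names are invented for the example.

```python
# Illustrative caching/broadcast-join pattern; table and column names are
# hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Small, frequently joined dimension: cache it so repeated queries reuse it.
dim_customers = spark.table("main.sales.dim_customers").cache()
dim_customers.count()  # materialize the cache eagerly

facts = spark.table("main.sales.orders")

# Broadcasting the small side avoids shuffling the multi-terabyte fact table.
enriched = facts.join(F.broadcast(dim_customers), "customer_id", "left")
enriched.groupBy("segment").agg(F.sum("amount").alias("revenue")).show()
```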
5. Governance & Security
- Implement and maintain data governance using Unity Catalog (see the sketch after this list).
- Enforce access controls, security policies, and data compliance requirements.
- Ensure lineage, quality checks, and auditability across data flows.
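As a sketch of routine Unity Catalog governance work, the statements below grant a group read access to a schema and then audit the result; the catalog, schema, and group names are placeholders, not values from the posting.

```python
# Unity Catalog grants issued from a Databricks notebook; catalog, schema,
# and group names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Let the group discover the catalog and schema, then read tables within it.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `data-analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.sales TO `data-analysts`")
spark.sql("GRANT SELECT ON SCHEMA main.sales TO `data-analysts`")

# Audit what the group can currently do on the schema.
spark.sql("SHOW GRANTS ON SCHEMA main.sales").show(truncate=False)
```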
Benefits:
- Competitive salary and benefits package
- Culture focused on talent development with quarterly promotion cycles and company-sponsored higher education and certifications
- Opportunity to work with cutting-edge technologies
- Employee engagement initiatives such as project parties, flexible work hours, and ‘Long Service’ awards
- Annual health check-ups as well as insurance:
  - Group term life insurance
  - Personal accident insurance
  - Mediclaim hospitalization insurance for self, spouse, two children, and parents
Why Persistent is an employer of choice
- Technology Innovation: culture of innovation using cutting-edge technology to bring value to clients.
- Growth and Career Progression: learning opportunities for growth, including quarterly promotion cycles.
- One Persistent Culture: global outlook with diversity and inclusion at its core.
- Mental and Physical Wellness: employee health and mindfulness programs.