Job Title: Data Engineer (Databricks with AWS & Python skills)
Mode: Remote
Location: Bengaluru
Primary Responsibilities
- Design, develop, and maintain data lakes and data pipelines on AWS using ETL frameworks and Databricks.
- Integrate and transform large-scale data from multiple heterogeneous sources into a centralized data lake environment.
- Implement and manage Delta Lake architecture using Databricks Delta or Apache Hudi.
- Develop end-to-end data workflows using PySpark, Databricks Notebooks, and Python scripts for ingestion, transformation, and enrichment.
- Design and develop data warehouses and data marts for analytical workloads using Snowflake, Redshift, or similar systems.
- Design and evaluate data models (Star, Snowflake, and Flattened schemas) for analytical and transactional systems.
- Optimize data storage, query performance, and cost across the AWS and Databricks ecosystem.
- Build and maintain CI/CD pipelines for Databricks notebooks, jobs, and Python-based data processing scripts.
- Collaborate with data scientists, analysts, and stakeholders to deliver high-performance, reusable data assets.
- Maintain and manage code repositories (Git) and promote best practices in version control, testing, and deployment.
- Participate in making major technical and architectural decisions for data engineering initiatives.
- Monitor and troubleshoot Databricks clusters, Spark jobs, and ETL processes for performance and reliability.
- Coordinate with business and technical teams through all phases of the software development life cycle.
You Must Have
- 5+ years of experience building and managing Data Lake Architecture on AWS Cloud.
- 3+ years of experience with AWS Data services such as S3, Glue, Lake Formation, EMR, Kinesis, RDS, DMS, and Redshift.
- 3+ years of experience building Data Warehouses on Snowflake, Redshift, HANA, Teradata, or Exasol.
- 3+ years of hands-on experience working with Apache Spark or PySpark on Databricks.
- 3+ years of experience implementing Delta Lakes using Databricks Delta or Apache Hudi.
- 3+ years of experience in ETL development using Databricks, AWS Glue, or other modern frameworks.
- Proficiency in Python for data engineering, automation, and API integrations.
- Experience with Databricks Jobs, Workflows, and cluster management.
- Experience with CI/CD pipelines and Infrastructure as Code (IaC) tools like Terraform or CloudFormation is a plus.
- Bachelor’s degree in Computer Science, Information Technology, Data Science, or a related field.
- Experience working on Agile projects and familiarity with Agile methodology in general.
We Value
- Strong SQL, RDBMS, and data modeling skills.
- Experience with Databricks Unity Catalog, Delta Live Tables (DLT), and MLflow for data governance and model lifecycle.
- AWS or Databricks Cloud Certifications (e.g., AWS Data Analytics Specialty, Databricks Certified Data Engineer Professional) are a big plus.
- Understanding of data security, access control, and compliance in cloud environments.
- Strong analytical, problem-solving, and communication skills.
Interested candidates can send their updated resume to gokul@bluecloudsoftech.com or share it via WhatsApp at 8825711546.
Job Type: Contractual / Temporary
Contract length: 14 months
Pay: ₹2,400,000.00 - ₹3,000,000.00 per year
Benefits:
- Provident Fund
- Work from home
Experience:
- Data Lake Architecture on AWS Cloud: 5 years (Required)
- AWS Data services such as S3, Glue, and Redshift: 3 years (Required)
- Data Warehouses on Snowflake, Redshift, or HANA: 3 years (Required)
- Apache Spark or PySpark on Databricks: 3 years (Required)
- Delta Lakes using Databricks Delta or Apache Hudi: 3 years (Required)
- ETL development using Databricks or AWS Glue: 3 years (Required)
Work Location: Remote