Skills: GCP Data Engineer
Experience: 6 to 12 years
Location: AIA-Gurgaon
- Hands-on experience with GCP services, specifically BigQuery, Cloud Storage, and Composer for data pipeline orchestration
- Proficiency in the Databricks platform with PySpark for building and optimizing large-scale ETL/ELT processes
- Expertise in writing and tuning complex SQL queries for data transformation, aggregation, and reporting on large datasets
- Experience integrating data from multiple sources, such as APIs, cloud storage, and databases, into a central data warehouse
- Familiarity with workflow orchestration tools such as Apache Airflow or Cloud Composer for scheduling, monitoring, and managing data jobs
- Knowledge of version control systems (Git), CI/CD practices, and Agile development methodologies
Overall Responsibilities
- Design, develop, and maintain scalable data pipelines using GCP, PySpark, and associated tools
- Write efficient, well-documented SQL queries to support data transformation, data quality, and reporting needs
- Integrate data from diverse sources, including APIs, cloud storage, and databases, to create a reliable central data repository
- Develop automated workflows and schedules for data processing tasks using Cloud Composer or Airflow
- Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements and deliver solutions
- Monitor, troubleshoot, and optimize data pipelines for performance, scalability, and reliability
- Maintain data security and privacy standards, and ensure documentation compliance
- Stay informed about emerging data engineering technologies and apply them effectively to improve workflows