Job Purpose
To lead the development and management of robust data infrastructure and pipelines across the CBG network, enabling the delivery of data as a product. This role ensures high-quality, accessible, and timely data is available for advanced analytics, reporting, and decision-making across all operational and business functions.
Key Responsibilities
Data Infrastructure Development
-
Design, implement, and manage scalable data architecture for collecting, storing, and processing large volumes of data from CBG plant systems (SCADA, PLC, IoT devices, SAP, EMS, LIMS, etc.).
-
Own the cloud/on-prem data lake, data warehouse, and structured databases supporting both real-time and batch processing.
Pipeline Engineering & Orchestration
-
Develop and maintain robust, automated data pipelines using modern ETL/ELT tools.
-
Ensure reliability, efficiency, and monitoring of all data flows from source to destination systems.
Data Quality & Governance
-
Implement processes to ensure data accuracy, consistency, completeness, and freshness.
-
Work with Data Governance and Compliance teams to define standards, validation rules, and audit trails.
Cross-Functional Collaboration
-
Collaborate with data scientists, business analysts, application teams, and plant operations to understand and prioritize data requirements.
-
Enable self-service data access through APIs, secure dashboards, and curated datasets.
Metadata & Cataloguing
-
Maintain a data catalogue and lineage tracking system to improve data discoverability and reusability across the organization.
-
Provide documentation and training on data schema, usage, and access policies.
Security & Compliance
-
Ensure data is stored and accessed securely, following best practices in encryption, role-based access, and regulatory compliance.
Key Skills
-
B.E. / B.Tech
-
Expertise in SQL, Python, DevOps, and distributed data technologies (e.g., Spark, Kafka).
-
Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
-
Strong understanding of CI/CD for data pipelines and MLOps integration.
-
Familiarity with industrial data sources (OPC-UA, MQTT, SCADA systems) is highly desirable.
-
Excellent leadership, documentation, and stakeholder communication skills.