Responsibilities
- Design & Develop Data Pipelines: Build, optimize, and maintain ETL/ELT pipelines to ingest, transform, and process large volumes of structured and unstructured data from diverse sources. Leverage Azure Data Factory, Azure Data Lake, Azure Synapse Analytics, Microsoft Fabric, or Databricks for scalable data integration and transformation.
- Data Modeling & Architecture: Design and implement data models and schemas optimized for analytics, machine learning, and AI-driven decision-making. Create and manage data warehouses, data lakes, and lakehouses to support data analytics and AI workloads.
- Data Governance & Security: Ensure data quality, compliance, and security by implementing governance frameworks and leveraging tools like Microsoft Purview. Enforce data security protocols, including role-based access control, data masking, and encryption.
- AI/ML Data Integration: Collaborate with data scientists to integrate data pipelines with machine learning workflows, enabling seamless training and deployment of models.
- Performance Optimization: Monitor and optimize the performance of data pipelines and cloud resources to ensure high availability, scalability, and cost efficiency.
Secondary Responsibilities
- Build AI Scoring Engines: Develop and implement AI scoring engines that automate decision-making for use cases such as fraud detection, customer segmentation, and product recommendations.
- Data Preparation for AI/ML Models: Partner with data scientists to prepare and preprocess data for machine learning models, including handling missing values, scaling, and feature engineering.
- AI/ML Model Development: Contribute to the development of predictive and prescriptive models using frameworks and tools such as scikit-learn, TensorFlow, PyTorch, and Azure Machine Learning studio.
- Model Deployment and Integration: Deploy machine learning models and scoring engines to production environments using Azure Machine Learning, integrating real-time and batch workflows.
- MLOps Implementation: Build and maintain MLOps pipelines to version, monitor, and retrain AI models in production, supporting continuous improvement of model performance.
- AI-Driven Insights: Support the integration of AI models with business applications to deliver actionable insights and improve operational efficiency.