Senior AI/ML Engineer (Remote, Pakistan Only)
Role Summary
We are seeking a skilled AI/ML Engineer to design, build, and deploy generative AI solutions using Python, Hugging Face, and Azure cloud services (Azure is our stack, but strong AWS/GCP experience is welcome). You will own the full lifecycle of AI/ML model development from prototyping through production, with strong emphasis on operational excellence, cloud troubleshooting, and rapid iteration. This role requires someone who can diagnose production issues quickly, work across Python and .NET ecosystems, and deliver measurable results within tight timelines.
This is a fully remote position for candidates based in Pakistan, working with our Canadian team on cutting-edge AI products.
Key Responsibilities
AI/ML Development and Productionization
- Design and implement preprocessing pipelines, inference services, and evaluation frameworks using Python and Hugging Face Transformers
- Fine-tune, optimize, and deploy large language models and generative AI models to production environments on Azure
- Build and maintain retrieval-augmented generation (RAG) systems integrating vector stores such as Azure AI Search or Azure Cosmos DB with LLM workflows
- Develop robust unit and integration tests for ML pipelines ensuring reliability before production deployment
- Implement prompt engineering strategies and optimize context management for production LLM applications
Cloud Infrastructure and Operations
- Containerize ML models using Docker and deploy to Azure Container Instances, Azure App Service, or Azure Kubernetes Service
- Implement health check endpoints, structured logging, and telemetry instrumentation for all deployed services
- Configure and monitor Azure Application Insights dashboards to track model performance, latency, error rates, and resource utilization
- Troubleshoot production incidents by analyzing Application Insights logs, distributed traces, and request telemetry to identify root causes and implement fixes
- Optimize inference latency and cloud costs through caching strategies, model quantization, and efficient resource allocation
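As a rough illustration of the health check and structured logging items above, here is a minimal, standard-library-only sketch. The names `JsonFormatter`, `health_payload`, `build_logger`, and the `fields` convention are hypothetical and chosen for this example; a real service would expose the payload through its web framework and forward the logs to its telemetry backend.

```python
import json
import logging
import time

_START = time.monotonic()


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"fields": {...}}
        # to attach structured values such as latency or request IDs.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def health_payload(model_loaded: bool) -> dict:
    """Body that a health endpoint could return; status reflects model readiness."""
    return {
        "status": "ok" if model_loaded else "degraded",
        "uptime_seconds": round(time.monotonic() - _START, 3),
    }


def build_logger(name: str = "inference") -> logging.Logger:
    """Create a logger that writes JSON lines to the standard error stream."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("request served", extra={"fields": {"latency_ms": 42}})` would then emit one JSON line that a log query can filter on.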
MLOps and CI/CD
- Build and maintain CI/CD pipelines using Azure DevOps or GitHub Actions for automated testing, model validation, and deployment
- Implement model versioning, A/B testing infrastructure, and rollback procedures for production ML services
- Manage containerized deployments and orchestration for scalable inference workloads
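One common way to approach the A/B testing and rollback item above is deterministic, hash-based traffic splitting. The sketch below is an assumption about how this could look, not a description of our infrastructure; `assign_variant` and the version labels `model-v1` and `model-v2` are hypothetical.

```python
import hashlib


def assign_variant(
    request_key: str,
    candidate_share: float,
    baseline: str = "model-v1",
    candidate: str = "model-v2",
) -> str:
    """Deterministically route a request key to the baseline or candidate model."""
    if not 0.0 <= candidate_share <= 1.0:
        raise ValueError("candidate_share must be between 0 and 1")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return candidate if bucket < candidate_share else baseline
```

Because the assignment depends only on the key, a given user keeps the same variant across requests, and setting `candidate_share` to 0 acts as an immediate rollback to the baseline.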
Cross-Platform Integration
- Work with the existing .NET API layer to integrate AI services using Azure SDKs
- Collaborate with .NET developers to ensure seamless integration between Python ML services and .NET applications
- Follow established patterns for dependency injection and service registration in .NET codebases
Collaboration and Documentation
- Partner with product, engineering, and DevOps teams to align AI initiatives with business objectives
- Document architectures, runbooks, and troubleshooting guides to enable team knowledge sharing
- Communicate technical concepts clearly to both technical and non-technical stakeholders
Required Skills and Experience
Python and ML Engineering (Primary)
- 3+ years of production Python development with demonstrated ability to write clean, testable, and maintainable code
- Hands-on experience with Hugging Face Transformers including model loading, tokenization, fine-tuning, and inference optimization
- Ability to implement a data preprocessing pipeline and wrap a Hugging Face model for inference with unit tests
- Strong proficiency with PyTorch, scikit-learn, pandas, and NumPy
- Experience building REST APIs using FastAPI or Flask for ML model serving
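To give a sense of the scale implied by "preprocessing pipeline plus model wrapper with unit tests", here is a minimal sketch. It assumes the model is any callable that takes a list of strings and returns one dict per input (a Hugging Face text-classification pipeline has roughly this shape); `preprocess`, `InferenceService`, and the fake predictor are hypothetical names, and the tests use a stub so they run without downloading a model.

```python
import re
import unittest
from typing import Callable, Dict, List

# Assumed shape of the model callable: a list of strings in,
# one {"label": ..., "score": ...} dict per string out.
Predictor = Callable[[List[str]], List[Dict[str, object]]]


def preprocess(text: str, max_chars: int = 512) -> str:
    """Collapse whitespace, trim the ends, and truncate to max_chars."""
    return re.sub(r"\s+", " ", text).strip()[:max_chars]


class InferenceService:
    """Thin wrapper that cleans inputs before handing them to the model."""

    def __init__(self, predictor: Predictor) -> None:
        self._predictor = predictor

    def predict(self, texts: List[str]) -> List[Dict[str, object]]:
        cleaned = [preprocess(t) for t in texts]
        if any(not c for c in cleaned):
            raise ValueError("empty input after preprocessing")
        return self._predictor(cleaned)


class InferenceServiceTest(unittest.TestCase):
    def test_inputs_are_cleaned_before_inference(self) -> None:
        seen: List[str] = []

        def fake_predictor(batch: List[str]) -> List[Dict[str, object]]:
            # Stub standing in for a real model; records what it was given.
            seen.extend(batch)
            return [{"label": "POSITIVE", "score": 0.9} for _ in batch]

        service = InferenceService(fake_predictor)
        result = service.predict(["  great\n\tproduct  "])
        self.assertEqual(seen, ["great product"])
        self.assertEqual(result[0]["label"], "POSITIVE")

    def test_blank_input_is_rejected(self) -> None:
        service = InferenceService(lambda batch: [])
        with self.assertRaises(ValueError):
            service.predict(["   "])
```

Injecting the predictor keeps the wrapper testable in CI without network access; the same class could be handed a real pipeline object in production.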
Cloud Operations (Azure preferred, AWS/GCP Transferable)
- While we build on Azure, we value strong conceptual knowledge of cloud operations. Experience with equivalent services in AWS (ECS, CloudWatch, Lambda) or GCP is fully acceptable, provided you are willing to cross-train.
- 2+ years deploying and operating workloads on Azure (Container Instances, App Service, or AKS) or on equivalent AWS/GCP services
- Demonstrated proficiency in Azure Application Insights including custom metrics, log queries using KQL, distributed tracing, and alert configuration
- Ability to containerize a Hugging Face model, deploy it to Azure, expose health and logging endpoints, and demonstrate telemetry in Application Insights
- Proven ability to analyze failing request traces and Application Insights logs in a production environment, identify root causes quickly, and propose fixes
- Experience with Azure DevOps or GitHub Actions for CI/CD pipelines
GenAI and LLM Integration
- Hands-on experience integrating Azure OpenAI Service, OpenAI APIs, or open-source LLMs into production applications
- Practical experience implementing RAG architectures with vector databases and embedding models
- Understanding of prompt engineering techniques, context management, token optimization, and LLM output parsing
- Understanding of how semantic and contextual relevance affect search accuracy and response quality in AI applications
- Familiarity with chunking strategies, semantic search, and hybrid retrieval approaches
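As one example of the chunking strategies mentioned above, a fixed-size window with overlap is a common baseline. The sketch below splits on whitespace for simplicity; production systems often count model tokens instead, and the function name and default sizes here are illustrative assumptions.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is made up purely of overlap.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of some duplicated embedding work.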
Preferred Qualifications
- Experience with LangChain, LlamaIndex, or similar LLM orchestration frameworks
- Familiarity with Semantic Kernel for .NET and Python
- Experience with model quantization and inference optimization tooling (ONNX Runtime, TensorRT)
- Knowledge of evaluation frameworks for LLM applications (RAGAS, custom metrics)
- Prior experience working with distributed teams across time zones
Azure .NET SDK Proficiency
Our AI services integrate with an existing .NET API layer. You will need to work with both Python and C# codebases.
Required .NET Skills
- Dependency Injection: Register and consume services using Microsoft.Extensions.DependencyInjection (conceptually similar to dependency injection in FastAPI)
- Azure SDKs: Use Azure.AI.OpenAI, Azure.Search.Documents, Azure.Storage.Blobs, and related packages (the client patterns closely mirror the Python Azure SDKs)
- Polly: Configure retry policies and resilience patterns for HTTP calls
- LINQ: Write query expressions for data manipulation (similar to Python list comprehensions)
- Standard Libraries: Use System.Text.Json, HttpClient, and common .NET packages for API development
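For candidates coming from Python, the retry behavior listed under Polly above can be pictured roughly as follows. This is a Python sketch of the general retry-with-exponential-backoff idea, not Polly's API; `call_with_retry` and its parameters are hypothetical names for this example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            # Delay doubles on each retry: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the behavior can be unit tested without real waiting; in the .NET codebase the equivalent policy would be configured declaratively rather than hand-written.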
What is NOT Required
- Entity Framework or any ORM patterns
- Complex .NET architecture patterns
- Prior .NET production experience