Key Responsibilities:
- Build & Deploy AI Solutions: Implement end-to-end AI systems (LLMs, computer vision, voice transcription, multimodal AI) using Python, PyTorch/TensorFlow, and Docker/Kubernetes.
- Optimize AI Models: Fine-tune or train LLMs (e.g., GPT-4, Llama 2) and build RAG pipelines; tune vision models (CNNs, ViTs) and speech models (Whisper, VITS) for low-latency inference.
- AI Pipeline Engineering: Design scalable data preprocessing, training, and serving pipelines (e.g., Ray, Kubeflow, Airflow).
- Edge/Cloud Deployment: Containerize models (Docker) and deploy on K8s, AWS SageMaker, or edge devices.
- Performance Tuning: Benchmark and optimize models for GPU/TPU acceleration (CUDA, TensorRT).
Required Skills:
- 5+ years in Python AI development (production experience required, not just research).
- Hands-on experience with LLMs (LangChain, Hugging Face), computer vision (OpenCV, YOLO), and voice AI (ASR, TTS).
- Strong MLOps skills: Docker, CI/CD for AI, model registries (MLflow, Weights & Biases).
- Experience with distributed training (FSDP, DeepSpeed, Horovod).
Nice-to-Have:
- NVIDIA Triton Inference Server, ONNX Runtime.
- Quantization/pruning for model optimization.
- CUDA-level performance debugging.
Important: Candidates must hold a transferable Iqama.
Job Type: Full-time
Application Question(s):
- Please confirm: do you have a transferable Iqama?