Position: AI Data Scientist & Annotation Expert
Location: On-site, Night Shift, Bahria Phase 4, Rawalpindi
Employment Type: Full-time
About the Role
We are looking for a skilled AI Data Scientist & Annotation Expert to drive the full lifecycle of AI/ML solutions — from data collection and annotation through advanced model development, training, and deployment. The role combines hands-on annotation expertise (critical for building high-quality training datasets) with data science and machine learning skills to ensure robust, production-ready AI systems.
You will play a central role in curating high-quality datasets, experimenting with algorithms, and ensuring model performance by leveraging both advanced data science techniques and precise annotation workflows.
Key Responsibilities
Data Science & Machine Learning
- Design, build, and evaluate machine learning/deep learning models for NLP, Computer Vision, or multimodal tasks.
- Perform feature engineering, exploratory data analysis (EDA), and statistical modeling.
- Experiment with different model architectures (CNNs, RNNs, Transformers, LLMs) and optimize hyperparameters for maximum performance.
- Develop and deploy scalable ML pipelines in production (using FastAPI, TensorFlow Serving, TorchServe, or containerized environments).
- Collaborate with engineers and product teams to integrate models into end-user applications.
Data Annotation & Curation
- Design annotation workflows and guidelines for various data types (images, PDFs, audio, video, and text).
- Perform and manage manual and semi-automated annotation tasks (e.g., bounding boxes, polygons, segmentation masks, labeling, and text tagging).
- Implement quality control measures to ensure annotations meet accuracy and consistency standards.
- Train and supervise annotation teams, providing feedback and ensuring adherence to project requirements.
- Work with annotation tools (e.g., Label Studio, CVAT, Supervisely, Prodigy, Labelbox).
- Assist in building gold-standard datasets that serve as the foundation for AI/ML training.
MLOps & DataOps
- Set up pipelines for dataset versioning, experiment tracking, and model reproducibility (using MLflow, DVC, or similar).
- Monitor dataset drift, class imbalances, and annotation consistency to avoid bias in models.
- Collaborate with DevOps teams for deployment, scaling, and monitoring of AI models.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, AI/ML, or a related field.
- Strong programming skills in Python and familiarity with ML frameworks (TensorFlow, PyTorch, Scikit-learn, Hugging Face).
- Solid understanding of machine learning algorithms, deep learning architectures, and statistical methods.
- Hands-on experience with data annotation platforms (Label Studio, CVAT, or similar).
- Familiarity with SQL/NoSQL databases and big data handling.
- Strong understanding of data preprocessing (cleaning, augmentation, normalization, and transformation).
Preferred Skills
- Prior experience in annotation team management or data labeling quality assurance.
- Knowledge of vector databases (pgvector, Pinecone, Weaviate, Chroma) for AI-driven search and RAG pipelines.
- Experience with MLOps tools (Airflow, Kubeflow, MLflow).
- Knowledge of annotation automation using weak supervision, active learning, or semi-supervised learning techniques.
- Understanding of ethical AI practices, bias detection, and responsible dataset curation.
Soft Skills
- Exceptional attention to detail and accuracy in handling data.
- Ability to train, mentor, and guide annotation teams.
- Strong analytical and problem-solving skills.
- Effective communication with both technical and non-technical audiences.
- Self-driven, proactive, and collaborative mindset.
Application Question(s):
- Are you comfortable working the night shift?
- Are you comfortable commuting to Bahria Phase 4, Rawalpindi?
Work Location: In person