Freelancing Opportunity
Hours Required: 5 hours daily
Job Title: Python AI/ML Data Engineer
About the Role
We are looking for a Python AI/ML Data Engineer to build intelligent automation solutions for Contract Lifecycle Management (CLM). This role focuses on applying unsupervised machine learning, NLP, and fuzzy matching techniques to process, classify, and match large volumes of unstructured contract data. You will work on scalable data pipelines that reduce manual effort and improve accuracy through human-in-the-loop AI systems.
Key Responsibilities
AI & Machine Learning
- Design and implement unsupervised ML models using scikit-learn to cluster and profile contract templates.
- Perform similarity analysis using techniques such as TF-IDF, cosine similarity, or distance-based models.
- Develop fuzzy matching logic (RapidFuzz / fuzzywuzzy) to reconcile extracted contract attributes with historical back-office data.
- Continuously refine models using feedback from manual review and stewardship teams.
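As a rough illustration of the matching work described above, the sketch below reconciles an extracted product name against a short reference list. It uses the standard library's difflib as a stand-in for RapidFuzz / fuzzywuzzy so that it runs without extra installs. The function names, the 0.8 threshold, and the sample product names are assumptions made for this example, not part of any existing codebase.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case- and whitespace-insensitive similarity ratio in [0, 1]."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def best_match(value, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate.

    When the best score is below the (assumed) threshold, return
    (None, score) so the record can be routed to manual review.
    """
    scored = [(c, similarity(value, c)) for c in candidates]
    candidate, score = max(scored, key=lambda pair: pair[1])
    return (candidate, score) if score >= threshold else (None, score)


# Hypothetical back-office product names, invented for illustration.
back_office = ["Acme Cloud Storage Pro", "Acme Analytics Suite", "Acme Support Plan"]

match, score = best_match("ACME  cloud storage  pro", back_office)
no_match, low_score = best_match("Unrelated consulting services", back_office)
```

Returning the score alongside the match keeps a confidence value available for the human-in-the-loop review step.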
Python Automation & Data Engineering
- Build Python scripts to parse, transform, and validate complex nested JSON contract data.
- Extract contract attributes (products, pricing, clauses, renewal terms, dates) using Regex and NLP techniques.
- Create batch-processing pipelines to handle 20–50+ products per run with deduplication and data integrity checks.
- Flatten unstructured data into structured, analysis-ready formats (CSV / Excel / relational tables).
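The parsing and flattening tasks above might look something like the following minimal sketch, which turns one nested contract record into flat, deduplicated rows. The JSON layout, the field names, and the choice to treat the first ISO-style date in the terms text as the renewal date are all assumptions made for illustration.

```python
import json
import re

# Hypothetical nested contract record; the field names are invented.
raw = """
{
  "contract_id": "C-001",
  "customer": {"name": "Example Corp"},
  "terms": "Renews on 2025-03-31 unless cancelled.",
  "products": [
    {"sku": "P-10", "price": 1200.0},
    {"sku": "P-20", "price": 450.5},
    {"sku": "P-10", "price": 1200.0}
  ]
}
"""

# Matches the YYYY-MM-DD shape only; it does not validate the calendar date.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def flatten_contract(contract):
    """Return one flat row per distinct product, dropping exact duplicates."""
    date_match = ISO_DATE.search(contract.get("terms", ""))
    rows, seen = [], set()
    for product in contract.get("products", []):
        key = (contract["contract_id"], product["sku"], product["price"])
        if key in seen:
            continue  # skip a product line already emitted for this contract
        seen.add(key)
        rows.append({
            "contract_id": contract["contract_id"],
            "customer_name": contract["customer"]["name"],
            "renewal_date": date_match.group(0) if date_match else None,
            "sku": product["sku"],
            "price": product["price"],
        })
    return rows


rows = flatten_contract(json.loads(raw))
```

Rows in this shape can be written out with csv.DictWriter or loaded into a pandas DataFrame for export to Excel.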
Data Validation & Quality Control
- Automatically flag data gaps, inconsistencies, and anomaly cases for manual review.
- Generate validation reports and exception logs for business and compliance teams.
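A simple version of the flagging described above could be sketched as follows. The required-field list, the "negative price" rule, and the sample rows are assumptions for this example; real rules would come from the business and compliance teams.

```python
# Assumed schema: fields every flattened row is expected to carry.
REQUIRED_FIELDS = ("contract_id", "sku", "price", "renewal_date")


def find_exceptions(rows):
    """Return (row_index, field, reason) tuples for rows needing manual review."""
    exceptions = []
    for index, row in enumerate(rows):
        # Data gaps: a required field is absent, None, or an empty string.
        for field in REQUIRED_FIELDS:
            if row.get(field) in (None, ""):
                exceptions.append((index, field, "missing value"))
        # Simple anomaly rule (assumed): prices should never be negative.
        price = row.get("price")
        if isinstance(price, (int, float)) and price < 0:
            exceptions.append((index, "price", "negative price"))
    return exceptions


# Hypothetical flattened rows, invented for illustration.
sample_rows = [
    {"contract_id": "C-001", "sku": "P-10", "price": 1200.0, "renewal_date": "2025-03-31"},
    {"contract_id": "C-002", "sku": "P-20", "price": -5.0, "renewal_date": None},
]

exception_log = find_exceptions(sample_rows)
```

Each tuple in the result can become one line of an exception log for reviewers.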
Reporting & Visualization
- Prepare structured outputs for dashboards tracking coverage %, match confidence, and model accuracy.
- Support reporting integrations with Power BI / Google Data Studio / Looker.
Required Skills & Qualifications
- Strong proficiency in Python.
- Hands-on experience with scikit-learn for clustering and similarity modeling.
- Advanced usage of pandas, numpy, and Python data pipelines.
- Strong knowledge of Regex and text processing.
- Experience parsing and transforming JSON and semi-structured data.
- Practical experience implementing fuzzy string matching in real-world use cases.
Preferred Qualifications
- Experience with NLP libraries such as spaCy, NLTK, or similar.
- Background in Contract Lifecycle Management (CLM), legal documents, or compliance data.
- Experience with feature engineering for unstructured text.
- Familiarity with human-in-the-loop or semi-automated AI systems.
Tech Stack
- Programming: Python 3.x
- ML & Data: scikit-learn, pandas, numpy
- Text & Matching: Regex, RapidFuzz / fuzzywuzzy
- Data Formats: JSON, CSV, Excel
- Tools: Git, Jupyter Notebook, Power BI / Looker
Job Types: Part-time, Freelance
Contract length: 12 months
Pay: ₹40,000.00 - ₹50,000.00 per month
Experience:
- Total work experience: 6 years (Required)
Shift availability:
- Night Shift (Required)
- Overnight Shift (Required)
Work Location: Remote