Python AI/ML Data Engineer

Freelancing Opportunity

Hours required: 5 hours daily

Job Title: Python AI/ML Data Engineer

About the Role

We are looking for a Python AI/ML Data Engineer to build intelligent automation solutions for Contract Lifecycle Management (CLM). This role focuses on applying unsupervised machine learning, NLP, and fuzzy matching techniques to process, classify, and match large volumes of unstructured contract data. You will work on scalable data pipelines that reduce manual effort and improve accuracy through human-in-the-loop AI systems.

Key Responsibilities

AI & Machine Learning

Design and implement unsupervised ML models using scikit-learn to cluster and profile contract templates.
Perform similarity analysis using techniques such as TF-IDF, cosine similarity, or distance-based models.
Develop fuzzy matching logic (RapidFuzz / fuzzywuzzy) to reconcile extracted contract attributes with historical back-office data.
Continuously refine models using feedback from manual review and stewardship teams.
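
To give a flavour of the fuzzy matching work described above, here is a minimal sketch that reconciles an extracted product name against a back-office list. It is only an illustration: it uses the standard library's difflib.SequenceMatcher as a stand-in for RapidFuzz / fuzzywuzzy, and the product names, the normalize and best_match helpers, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not lower the score."""
    return " ".join(name.lower().split())


def best_match(extracted, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate.

    If no candidate reaches the threshold, return (None, best score seen),
    which signals that the record should go to manual review.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(extracted), normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


# Hypothetical back-office product list and extracted values.
back_office = ["Acme Cloud Storage Pro", "Acme Analytics Suite", "Acme Support Plan"]

match, score = best_match("ACME  cloud storage pro", back_office)
unmatched, low_score = best_match("Zenith Payroll", back_office)
```

In a human-in-the-loop setup, results that fall below the threshold would typically be queued for a reviewer rather than discarded, and the reviewer's decisions could then be used to tune the threshold.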

Python Automation & Data Engineering

Build Python scripts to parse, transform, and validate complex nested JSON contract data.
Extract contract attributes (products, pricing, clauses, renewal terms, dates) using Regex and NLP techniques.
Create batch-processing pipelines to handle 20–50+ products per run with deduplication and data integrity checks.
Flatten unstructured data into structured, analysis-ready formats (CSV / Excel / relational tables).
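
The parsing and flattening tasks above might look roughly like the following sketch. The JSON layout, the field names, and the ISO date pattern are assumptions made for illustration; real contract payloads will differ, and pandas could be used instead of the plain-Python flatten helper shown here.

```python
import json
import re


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Scalar leaf: drop the trailing dot from the accumulated prefix.
        flat[prefix.rstrip(".")] = obj
    return flat


# Assumed pattern: dates written as YYYY-MM-DD inside free-text terms.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Hypothetical contract payload.
raw = json.loads("""
{"contract": {"id": "C-001",
  "products": [{"name": "Widget", "price": 120.5}],
  "terms": "Renews on 2025-01-31 unless cancelled by 2024-12-01."}}
""")

row = flatten(raw)
dates = ISO_DATE.findall(row["contract.terms"])
```

Each flattened row maps directly onto a CSV line or a relational table row, with the dotted keys serving as column names.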

Data Validation & Quality Control

Automatically flag data gaps, inconsistencies, and anomaly cases for manual review.
Generate validation reports and exception logs for business and compliance teams.
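
As a rough illustration of the flagging step above, the sketch below builds an exception log from a list of flattened records. The required field names and the three checks (missing fields, negative price, duplicate contract ID) are hypothetical examples, not the actual validation rules for this role.

```python
# Hypothetical set of fields every record is expected to carry.
REQUIRED_FIELDS = ("contract_id", "product", "price", "renewal_date")


def find_exceptions(records):
    """Return a list of exception entries describing gaps and anomalies."""
    exceptions = []
    seen_ids = set()
    for index, record in enumerate(records):
        # Data gaps: required fields that are absent, None, or empty strings.
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            exceptions.append(
                {"row": index, "issue": "missing fields", "detail": ", ".join(missing)}
            )
        # Anomaly: a negative price.
        price = record.get("price")
        if isinstance(price, (int, float)) and price < 0:
            exceptions.append(
                {"row": index, "issue": "negative price", "detail": str(price)}
            )
        # Inconsistency: the same contract ID appearing more than once.
        contract_id = record.get("contract_id")
        if contract_id in seen_ids:
            exceptions.append(
                {"row": index, "issue": "duplicate contract_id", "detail": str(contract_id)}
            )
        elif contract_id:
            seen_ids.add(contract_id)
    return exceptions


# Hypothetical sample records.
records = [
    {"contract_id": "C-001", "product": "Widget", "price": 120.5, "renewal_date": "2025-01-31"},
    {"contract_id": "C-002", "product": "", "price": -5, "renewal_date": None},
    {"contract_id": "C-001", "product": "Gadget", "price": 10, "renewal_date": "2025-06-30"},
]

exception_log = find_exceptions(records)
```

The resulting list of dictionaries can be written out as a CSV or Excel exception report for the business and compliance teams.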

Reporting & Visualization

Prepare structured outputs for dashboards tracking coverage %, match confidence, and model accuracy.
Support reporting integrations with Power BI / Google Data Studio / Looker.

Required Skills & Qualifications

Strong proficiency in Python.
Hands-on experience with scikit-learn for clustering and similarity modeling.
Advanced usage of pandas, numpy, and Python data pipelines.
Strong knowledge of Regex and text processing.
Experience parsing and transforming JSON and semi-structured data.
Practical experience implementing fuzzy string matching in real-world use cases.

Preferred Qualifications

Experience with NLP libraries such as spaCy, NLTK, or similar.
Background in Contract Lifecycle Management (CLM), legal documents, or compliance data.
Experience with feature engineering for unstructured text.
Familiarity with human-in-the-loop or semi-automated AI systems.

Tech Stack

Programming: Python 3.x
ML & Data: scikit-learn, pandas, numpy
Text & Matching: Regex, RapidFuzz / fuzzywuzzy
Data Formats: JSON, CSV, Excel
Tools: Git, Jupyter Notebook, Power BI / Looker

Job Types: Part-time, Freelance
Contract length: 12 months

Pay: ₹40,000.00 - ₹50,000.00 per month

Benefits:

Work from home

Experience:

Total work experience: 6 years (Required)

Shift availability:

Night Shift (Required)
Overnight Shift (Required)

Work Location: Remote
