Data Engineer

Taraki is hiring for one of its clients.

Location: Remote

Experience Level: 5 to 8 years in Data Engineering, Data Pipelines, and Cloud-based Data Platforms

Department: Data & AI Engineering

Compensation: PKR 600,000 to 850,000 (based on experience)

Role Summary:

The Data Engineer will design and build large-scale, high-performance data pipelines to support segmentation, pricing simulation, and offer decisioning. They will ensure efficient ingestion of data from telco systems (CDRs, usage, recharge, and offer purchase events), its transformation, and its integration with ML models and orchestration modules.
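
As a rough illustration of the batch side of such a pipeline, a daily PySpark job might look like the sketch below. It is illustrative only: the paths, column names, and 30-day aggregation window are hypothetical placeholders rather than details of the role.

```python
# Illustrative daily batch ETL step in PySpark: join usage and recharge extracts
# into a customer-level profile snapshot. All paths, column names, and the 30-day
# window are hypothetical placeholders, not taken from the posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer-profile-sketch").getOrCreate()

usage = spark.read.parquet("s3://example-bucket/raw/usage/")        # placeholder path
recharge = spark.read.parquet("s3://example-bucket/raw/recharge/")  # placeholder path

# Aggregate each source to one row per subscriber over a trailing 30-day window.
usage_agg = (
    usage.filter(F.col("event_date") >= F.date_sub(F.current_date(), 30))
    .groupBy("msisdn")
    .agg(
        F.sum("data_mb").alias("data_mb_30d"),
        F.sum("voice_minutes").alias("voice_min_30d"),
    )
)

recharge_agg = (
    recharge.filter(F.col("event_date") >= F.date_sub(F.current_date(), 30))
    .groupBy("msisdn")
    .agg(
        F.sum("amount").alias("recharge_amt_30d"),
        F.count("*").alias("recharge_cnt_30d"),
    )
)

# Build the customer profile and write it as a partitioned Parquet snapshot
# that downstream segmentation and ML feature jobs can read.
profile = (
    usage_agg.join(recharge_agg, on="msisdn", how="outer")
    .na.fill(0)
    .withColumn("snapshot_date", F.current_date())
)

(
    profile.write.mode("overwrite")
    .partitionBy("snapshot_date")
    .parquet("s3://example-bucket/profile/customer_profile/")       # placeholder path
)
```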

Key Responsibilities:

  • Design and develop scalable ETL / ELT data pipelines to process 50M+ customer records daily.
  • Ingest data from OCS, CRM, DWH, and Adobe RT-CDP or other customer data platforms.
  • Build and maintain Customer Profile Store and Feature Store for real-time and batch processing.
  • Implement data validation, quality, and lineage frameworks.
  • Optimize query performance and cost efficiency for batch and streaming workloads.
  • Collaborate with Data Scientists to prepare model training datasets and deploy inference pipelines.
  • Integrate outputs with Decision Engine and Real-Time Offer Orchestration Module.
  • Automate pipelines using CI/CD and maintain environment configurations across Dev, UAT, and Prod.


Required Skills

  • Strong proficiency in SQL, PySpark, and DataFrame APIs for data transformation.
  • Expertise in Data Modeling (customer-level, event-level, offer-level).
  • Understanding of data partitioning, schema evolution, and performance tuning.
  • Experience with stream processing (Kafka, Spark Streaming, Kinesis); see the illustrative sketch after this list.
  • Knowledge of data quality frameworks (e.g., Great Expectations, Deequ).
  • Familiarity with ETL orchestration tools (Airflow, dbt, or Dagster).
  • Ability to work with cloud-native data platforms and object storage.
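
For the stream-processing requirement, a Spark Structured Streaming job in this stack might look roughly like the following. It is an illustrative sketch only: the Kafka topic, broker address, event schema, and storage paths are hypothetical, and it assumes the Spark Kafka connector package (spark-sql-kafka) is available on the cluster.

```python
# Illustrative streaming ingestion with Spark Structured Streaming: read recharge
# events from Kafka and append them to object storage. Topic, schema, broker, and
# paths are hypothetical placeholders, not details from the posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("recharge-stream-sketch").getOrCreate()

# Assumed JSON payload for a recharge event (hypothetical schema).
event_schema = StructType([
    StructField("msisdn", StringType()),
    StructField("recharge_amount", DoubleType()),
    StructField("channel", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "recharge-events")              # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers the payload as bytes in the `value` column; parse it as JSON
# and apply a watermark so late events are bounded.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
    .withWatermark("event_time", "10 minutes")
)

# Append the parsed events to object storage, partitioned by day.
query = (
    events.withColumn("event_date", F.to_date("event_time"))
    .writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/recharge_events/")             # placeholder path
    .option("checkpointLocation", "s3://example-bucket/chk/recharge/")  # placeholder path
    .partitionBy("event_date")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```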


Tools & Technologies

  • Data Platform: Databricks, AWS Glue, Azure Data Factory, Snowflake, BigQuery
  • Streaming: Kafka, Kinesis, Spark Streaming
  • Storage: S3, Delta Lake, Parquet, Hive
  • Workflow Orchestration: Airflow, dbt, Dagster, Prefect
  • Scripting: Python, SQL, PySpark
  • DevOps: Git, Jenkins, Terraform
  • Monitoring & Validation: Great Expectations, Deequ, Datadog

Preferred (Nice-to-Have)

  • Experience with telecom datasets (Recharge, Usage, Balance, Offer Subscription).
  • Knowledge of DecisionRules.io, n8n, or KNIME for orchestration workflows.
  • Familiarity with Adobe AEP data schemas (XDM) or Pricefx integration.
  • Exposure to real-time microservices (REST/GraphQL APIs) for data access.
