Hyderabad, Pakistan
Working Hours: Full Time
Locations: Hyderabad
Experience: 4–6 years
Soothsayer Analytics is a global AI & Data Science consultancy headquartered in Detroit, with a thriving delivery center in Hyderabad. We design and deploy end-to-end custom Machine Learning & GenAI solutions—spanning predictive analytics, optimization, NLP, and enterprise-scale AI platforms—that help leading enterprises forecast, automate, and gain a competitive edge.
As a Data Engineer, you will build the foundation that powers these AI systems—scalable, secure, and high-performance data pipelines.
Job Overview
We seek a Data Engineer (Mid-level) with 4–6 years of hands-on experience in designing, building, and optimizing data pipelines. You will work closely with AI/ML teams to ensure data availability, quality, and performance for analytics and GenAI use cases.
Key Responsibilities
Data Pipeline Development
· Build and maintain scalable ETL/ELT pipelines for structured and unstructured data.
· Ingest data from diverse sources (APIs, streaming, batch systems).
Data Modeling & Warehousing
· Design efficient data models to support analytics and AI workloads.
· Develop and optimize data warehouses/lakes using Redshift, BigQuery, Snowflake, or Delta Lake.
Big Data & Streaming
· Work with distributed systems like Apache Spark, Kafka, or Flink for real-time/large-scale data processing.
· Manage feature stores for ML pipelines.
Collaboration & Best Practices
· Work closely with Data Scientists and ML Engineers to ensure high-quality training data.
· Implement data quality checks, observability, and governance frameworks.
Required Skills & Qualifications
Education: Bachelor’s/Master’s in Computer Science, Data Engineering, or a related field.
Experience: 4–6 years in data engineering with expertise in:
· Programming: Python/Scala/Java (Python preferred).
· Big Data & Processing: Apache Spark, Kafka, Hadoop.
· Databases: SQL/NoSQL (Postgres, MongoDB, Cassandra).
· Data Warehousing: Snowflake, Redshift, BigQuery, or similar.
· Orchestration: Airflow, Luigi, or similar.
· Cloud Platforms: AWS, Azure, or GCP (data services).
· Version Control & CI/CD: Git, Jenkins, GitHub Actions.
· MLOps/GenAI pipelines: feature engineering, embeddings, vector DBs.
Skills Matrix
Candidates must submit a detailed resume and fill out the following matrix:
| Skill                                        | Details | Skills Last Used | Experience (months) | Self-Rating (0–10) |
|----------------------------------------------|---------|------------------|---------------------|--------------------|
| Python                                       |         |                  |                     |                    |
| SQL / NoSQL                                  |         |                  |                     |                    |
| Apache Spark                                 |         |                  |                     |                    |
| Kafka                                        |         |                  |                     |                    |
| Data Warehousing (Snowflake, Redshift, etc.) |         |                  |                     |                    |
| Orchestration (Airflow, Luigi, etc.)         |         |                  |                     |                    |
| Cloud (AWS / Azure / GCP)                    |         |                  |                     |                    |
| Data Quality / Governance Tools              |         |                  |                     |                    |
| MLOps / LLMOps                               |         |                  |                     |                    |
| GenAI Integration                            |         |                  |                     |                    |
Instructions for Candidates:
· Provide a detailed resume highlighting end-to-end data engineering projects.
· Fill out the above skills matrix with accurate dates, durations, and self-ratings.