Hyderabad, India
Working Hours : Full Time
Locations : Hyderabad
Experience : 4–6 years
Soothsayer Analytics is a global AI & Data Science consultancy headquartered in Detroit, with a thriving delivery center in Hyderabad. We design and deploy end-to-end custom Machine Learning & GenAI solutions—spanning predictive analytics, optimization, NLP, and enterprise-scale AI platforms—that help leading enterprises forecast, automate, and gain a competitive edge.
As a Data Engineer, you will build the foundation that powers these AI systems—scalable, secure, and high-performance data pipelines.
Job Overview
We are seeking a mid-level Data Engineer with 4–6 years of hands-on experience designing, building, and optimizing data pipelines. You will work closely with AI/ML teams to ensure data availability, quality, and performance for analytics and GenAI use cases.
Key Responsibilities
Data Pipeline Development:
· Build and maintain scalable ETL/ELT pipelines for structured and unstructured data.
· Ingest data from diverse sources (APIs, streaming, batch systems).
Data Modeling & Warehousing:
· Design efficient data models to support analytics and AI workloads.
· Develop and optimize data warehouses/lakes using Redshift, BigQuery, Snowflake, or Delta Lake.
Big Data & Streaming:
· Work with distributed systems such as Apache Spark, Kafka, or Flink for real-time and large-scale data processing.
· Manage feature stores for ML pipelines.
Collaboration & Best Practices:
· Work closely with Data Scientists and ML Engineers to ensure high-quality training data.
· Implement data quality checks, observability, and governance frameworks.
Required Skills & Qualifications
Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
Experience: 4–6 years in data engineering with expertise in:
· Programming: Python/Scala/Java (Python preferred).
· Big Data & Processing: Apache Spark, Kafka, Hadoop.
· Databases: SQL/NoSQL (Postgres, MongoDB, Cassandra).
· Data Warehousing: Snowflake, Redshift, BigQuery, or similar.
· Orchestration: Airflow, Luigi, or similar.
· Cloud Platforms: AWS, Azure, or GCP (data services).
· Version Control & CI/CD: Git, Jenkins, GitHub Actions.
· MLOps/GenAI pipelines: feature engineering, embeddings, vector databases.
Skills Matrix
Candidates must submit a detailed resume and fill out the following matrix:
| Skill | Details | Skills Last Used | Experience (months) | Self-Rating (0–10) |
| --- | --- | --- | --- | --- |
| Python | | | | |
| SQL / NoSQL | | | | |
| Apache Spark | | | | |
| Kafka | | | | |
| Data Warehousing (Snowflake, Redshift, etc.) | | | | |
| Orchestration (Airflow, Luigi, etc.) | | | | |
| Cloud (AWS / Azure / GCP) | | | | |
| Data Quality / Governance Tools | | | | |
| MLOps / LLMOps | | | | |
| GenAI Integration | | | | |
Instructions for Candidates:
· Provide a detailed resume highlighting end-to-end data engineering projects.
· Fill out the above skills matrix with accurate dates, durations, and self-ratings.