About the Role

As a Quantitative Data Engineer, you will be the backbone of the data ecosystem powering our quantitative research, trading, and AI-driven strategies. You will design, build, and maintain the high-performance data infrastructure that enables low-latency, high-fidelity access to market, fundamental, and alternative data across multiple asset classes.
 
This role bridges quant engineering, data systems, and research enablement, ensuring that our researchers and traders have fast, reliable, and well-documented datasets for analysis and live trading. You’ll be part of a cross-functional team working at the intersection of finance, machine learning, and distributed systems.
 
Responsibilities

- Architect and maintain scalable ETL pipelines for ingesting and transforming terabytes of structured, semi-structured, and unstructured market and alternative data.
- Design time-series-optimized data stores and streaming frameworks to support low-latency data access for both backtesting and live trading.
- Develop ingestion frameworks integrating vendor feeds (Bloomberg, Refinitiv, Polygon, Quandl, etc.), exchange data, and internal execution systems.
- Collaborate with quantitative researchers and ML teams to ensure data accuracy, feature availability, and schema evolution aligned with modeling needs.
- Implement data quality checks, validation pipelines, and version control mechanisms for all datasets.
- Monitor and optimize distributed compute environments (Spark, Flink, Ray, or Dask) for performance and cost efficiency.
- Automate workflows using orchestration tools (Airflow, Prefect, Dagster) for reliability and reproducibility (see the sketch after this list).
- Establish best practices for metadata management, lineage tracking, and documentation.
- Contribute to internal libraries and SDKs for seamless data access by trading and research applications.
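
To make the orchestration and data-quality bullets concrete, below is a minimal sketch of a daily ingest-and-validate Airflow DAG. The DAG id, staging path, columns, and quality gates are illustrative assumptions, not a description of our production pipelines.

```python
# Minimal sketch only: paths, columns, and thresholds are hypothetical.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

STAGING = "/tmp/trades_{ds}.parquet"  # stand-in for an object-store path

def ingest(ds, **_):
    """Stand-in for pulling one day of vendor trades and landing it as Parquet."""
    df = pd.DataFrame(
        {"symbol": ["AAPL", "MSFT"], "price": [189.5, 411.2], "size": [100, 250]}
    )
    df.to_parquet(STAGING.format(ds=ds))

def validate(ds, **_):
    """Basic quality gates; a production pipeline might use Great Expectations."""
    df = pd.read_parquet(STAGING.format(ds=ds))
    assert not df.empty, "no rows ingested"
    assert (df["price"] > 0).all(), "non-positive prices"
    assert df["symbol"].notna().all(), "missing symbols"

with DAG(
    dag_id="daily_trades_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    validate_task = PythonOperator(task_id="validate", python_callable=validate)
    ingest_task >> validate_task
```

A production version would land data in object storage and surface failed checks through alerting rather than bare assertions.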
  
In Trading Firms, Data Engineers Typically:

- Build real-time data streaming systems to capture market ticks, order books, and execution signals (see the consumer sketch after this list).
- Manage versioned historical data lakes for backtesting and model training.
- Handle multi-venue data normalization (different exchanges and instruments).
- Integrate alternative datasets (satellite imagery, news sentiment, ESG, supply-chain data).
- Work closely with quant researchers to convert raw data into research-ready features.
- Optimize pipelines for ultra-low latency, where milliseconds can impact P&L.
- Implement data observability frameworks to ensure uptime and quality.
- Collaborate with DevOps and infra engineers to scale storage, caching, and compute.
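
As a flavor of the streaming and normalization work above, here is a minimal sketch of a Kafka consumer that maps venue-specific tick messages onto one canonical schema; the topic name, venues, and payload fields are hypothetical.

```python
# Minimal sketch only: the topic, venues, and message fields are hypothetical.
import json

from kafka import KafkaConsumer  # kafka-python client

def normalize(venue, msg):
    """Map a venue-specific payload onto one canonical tick schema."""
    if venue == "venue_a":
        return {"symbol": msg["sym"], "price": msg["px"], "size": msg["qty"]}
    # Fallback mapping for a second hypothetical venue format.
    return {"symbol": msg["symbol"], "price": msg["last"], "size": msg["volume"]}

consumer = KafkaConsumer(
    "market.ticks.raw",                     # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw),
)

for record in consumer:
    venue = record.key.decode() if record.key else "venue_a"
    tick = normalize(venue, record.value)
    # In practice: write to a time-series store or publish to a normalized topic.
    print(tick)
```

Real systems typically key messages by venue or instrument and publish normalized ticks downstream instead of printing them.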
  
Tech Stack

- Languages: Python, SQL, Scala, Go, Rust (optional, for HFT pipelines)
- Data Processing: Apache Spark, Flink, Ray, Dask, Pandas, Polars
- Workflow Orchestration: Apache Airflow, Prefect, Dagster
- Databases & Storage: PostgreSQL, ClickHouse, DuckDB, Elasticsearch, Redis
- Data Lakes: Delta Lake, Iceberg, Hudi, Parquet
- Streaming: Kafka, Redpanda, Pulsar
- Cloud & Infra: AWS (S3, EMR, Lambda), GCP, Azure, Kubernetes
- Versioning, Lineage & Data Quality: DVC, MLflow, Feast, Great Expectations
- Visualization / Monitoring: Grafana, Prometheus, Superset, Datadog
- Tools for Finance: kdb+/q (for tick data), InfluxDB, QuestDB (a tick-to-bars sketch follows this list)
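
To ground the stack, here is a minimal sketch of turning raw ticks stored as Parquet into one-minute OHLCV bars with DuckDB; the file path and column names (ts, price, size) are assumptions for illustration.

```python
# Minimal sketch only: the Parquet path and column names are hypothetical.
import duckdb

bars = duckdb.sql(
    """
    SELECT
        symbol,
        time_bucket(INTERVAL '1 minute', ts) AS minute,
        first(price ORDER BY ts) AS open,
        max(price)               AS high,
        min(price)               AS low,
        last(price ORDER BY ts)  AS close,
        sum(size)                AS volume
    FROM read_parquet('ticks/2024-01-02/*.parquet')
    GROUP BY symbol, minute
    ORDER BY symbol, minute
    """
).df()
print(bars.head())
```

Bar-building of this kind is often the first step toward the research-ready features mentioned above.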
  
What You Will Gain

- End-to-end ownership of core data infrastructure in a high-impact, mission-critical domain.
- Deep exposure to quantitative research workflows, market microstructure, and real-time trading systems.
- Collaboration with elite quantitative researchers, traders, and ML scientists.
- Hands-on experience with cutting-edge distributed systems and time-series data technologies.
- A culture that emphasizes technical excellence, autonomy, and experimentation.
   
Qualifications

- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 2+ years of experience building and maintaining production-grade data pipelines.
- Proficiency in Python, SQL, and frameworks such as Airflow, Spark, or Flink.
- Familiarity with cloud storage and compute (S3, GCS, EMR, Dataproc) and versioned data lakes (Delta, Iceberg).
- Experience with financial datasets, tick-level data, or high-frequency time series is a strong plus.
- Strong understanding of data modeling, schema design, and performance optimization.
- Excellent communication skills and the ability to support multidisciplinary teams.