Earthmover builds cloud data infrastructure for the heaviest scientific workloads of our time: climate and weather, Earth observation, genomics, ML, and large-scale AI research. Our platform supports multidimensional array data at scale, with correctness, performance, and scientific reproducibility at its core.
The platform consists of three major components:
- Arraylake: a cloud-native data lake for multidimensional scientific array data, offering versioning, cataloging, access control, and collaboration;
- Flux: a standards-compliant API gateway that serves multidimensional datasets to clients and applications;
- Icechunk: our open-source Rust-based array database that underpins the entire system.
We are long-time contributors to the scientific data ecosystem through projects such as Zarr, Xarray, Icechunk, and xpublish-tiles. As we expand Arraylake and Flux and introduce new capabilities, we’re focused on problems that matter deeply to scientific users:
- Fast, transparent array access along any dimension
- Search, discoverability, and cross-organization collaboration
- Performance, stability, and scalability across cloud and on-prem environments
- Public datasets that can meaningfully accelerate global scientific work
- Intuitive navigation and visualization of complex hierarchical scientific structures
- Highly available, multi-region APIs for data delivery
As a scientific data platform company, we have two primary goals: deliver a world-class array storage and processing system, and build an exceptional collaborative experience on top of it. We’ve established a strong foundation; the next phase is shipping features and products that directly unblock scientific and analytical work at scale.
- Maintain and expand Icechunk, our open-source Rust array database.
- Design, build, and operate Rust-based pipelines, services, and execution engines within the Arraylake platform.
- Improve our API performance, reliability, and overall quality, including observability, stability, versioning, and integration patterns.
- 5+ years of backend engineering experience.
- 1+ years of professional Rust experience (async Rust is a strong plus).
- You write Python comfortably, or can become proficient quickly.
- Experience building and testing distributed systems.
- You are product-minded and enjoy working directly with both scientists and customers to shape features around real workflows.
- You enjoy partnering with other parts of the stack (e.g., frontend) to rapidly iterate on new product capabilities.
- You are excited about the mission described above, even if you haven’t worked with every technology or responsibility listed. We have a strong team and can help the right candidate grow into parts of the role.
- Cloud: Primarily AWS; also supporting Google Cloud and on-prem deployments
- Languages: Python and Rust for services and array storage; Python for client libraries
- Infra: Pulumi (TypeScript) for IaC
- Frontend: Next.js deployed on Vercel