
Infrastructure Engineer

Remote, USA | Full-time | Posted 2026-06-21

About the role

Own the platform that powers our accelerator cloud. Your scope spans bare-metal provisioning, multi-tenant Kubernetes, SLURM scheduling, control planes, and the automation and observability that keep thousands of compute nodes running as a single production system.

What you'll do

- Build the control plane and APIs that unify our compute fleet
- Own provisioning and lifecycle from rack bring-up to node retirement
- Operate the scheduling layer for training and inference workloads
- Architect multi-tenancy: isolation, quota, fairness, and accounting
- Build automation that eliminates manual operations
- Drive reliability, observability, and incident response across the fleet

What you'll need

- BS in CS, EE, or related field, or equivalent experience
- 5+ years in infrastructure, platform, or backend engineering
- Advanced software engineering skills: Rust, Go, or Python
- Deep understanding of Linux, storage, and distributed systems
- Experience with workload schedulers: SLURM, Kubernetes scheduling, or equivalent
- Expertise with automation tooling: Terraform, Ansible, Helm
- Experience architecting multi-tenant systems
- Production SRE experience: on-call, incident response, observability

What we offer

- Top-tier compensation structured to recognize and retain the best talent
- Meaningful equity
- Comprehensive medical, dental, vision, life, and disability insurance
- Parental leave for all new parents, including adoptive and surrogate journeys
- Flexible PTO
- Paid holidays
- Relocation support

Equal Employment Opportunity

We're an Equal Opportunity Employer and do not discriminate on the basis of any protected status under applicable law.
