[Remote] Senior Platform Engineer
Note: This is a remote position open to candidates in the USA. MOXFIVE is building technologies that leverage AI to streamline enterprise response to, recovery from, and resilience against cyber attacks. They are seeking a Senior Platform Engineer to enhance the reliability, security, and deployability of their platform, directly impacting their engineering team's efficiency. The role involves owning cloud infrastructure, improving CI/CD pipelines, and ensuring operational readiness while promoting security and developer velocity.
Responsibilities
- Own and improve the platform foundation that helps a high-velocity engineering team ship safely across cloud infrastructure, Kubernetes, IaC, secrets, networking, access controls, CI/CD, observability, and production guardrails
- Build internal tooling for an AI-enabled engineering workflow, including automation, repo and CI feedback loops, agent-ready development environments, and safeguards that let engineers move quickly without weakening production discipline
- Strengthen operational readiness through better logging, metrics, tracing, alerting, runbooks, and incident follow-up
- Harden production access with least-privilege IAM, secure secret management, auditability, and controlled break-glass paths
- Set pragmatic platform standards that help a small team move quickly today while avoiding infrastructure, reliability, and security debt tomorrow
Skills
- 5+ years of experience in platform engineering, DevOps, SRE, infrastructure engineering, or backend-adjacent cloud operations
- A track record of owning production systems where reliability, security, and developer velocity all matter
- Hands-on experience with cloud infrastructure, Kubernetes, infrastructure-as-code, CI/CD, secrets management, access controls, and observability
- Experience building internal developer tooling, platform automation, or AI-assisted development workflows
- Comfort designing safe release processes with deployment gates, smoke tests, rollback paths, and clear ownership
- Practical experience supporting relational databases and production data changes
- A security-minded approach to infrastructure, including least privilege, auditability, secret handling, and controlled production access
- Clear written communication for runbooks, deployment notes, incident follow-ups, and engineering decisions
- Familiarity with agent harness design and agent sandboxing, including tool access, environment setup, state management, permissions, and production guardrails
- Experience managing production model inference across hosted providers (such as Together AI or Fireworks.ai), GPU platforms (such as RunPod, Lambda Cloud, Modal, or similar), or self-hosted serving stacks, including the tradeoffs between hosted APIs, dedicated deployments, serverless GPUs, and self-hosted inference
Benefits
- Bonus offered
Company Overview