[Remote] Senior Site Reliability Engineer

Remote, USA | Full-time | Posted 2026-06-16

Note: This is a remote position open to candidates in the USA. Lean Tech is a rapidly expanding organization in the technology services sector, seeking a highly experienced Senior Site Reliability Engineer. The role focuses on evolving the reliability, security, observability, and operational maturity of its cloud platform, using AI tools and practices to improve operational efficiency.

Responsibilities

  • Own and evolve the reliability, security, observability, and operational maturity of our cloud platform
  • Use AI tools and agentic workflows to automate infrastructure and SRE tasks
  • Manage production infrastructure for SaaS platforms, including senior AWS ownership
  • Lead production incidents and drive root-cause analysis, creating remediation plans
  • Ensure compliance with security best practices and maintain compliance controls

Skills

  • Expert use of AI tools and agentic workflows to automate infrastructure and SRE tasks
  • Hands-on experience using AI for Terraform development, incident triage, log analysis, runbook creation, postmortems, operational automation, CI/CD pipeline generation, and reducing repetitive operational work
  • Strong understanding of AI capabilities, limitations, and necessary validation processes
  • Ability to clearly articulate AI workflows, tooling choices, operational safeguards, and production outcomes
  • 10+ years managing production infrastructure for SaaS platforms, including 5+ years of senior AWS ownership
  • Deep expertise with AWS services such as ECS, VPC, IAM, RDS, S3, CloudFront, Route53, Lambda, API Gateway, CloudWatch, Secrets Manager, and related security and governance services
  • Advanced Terraform experience managing multi-account, multi-workspace infrastructure
  • Strong understanding of provider versioning, state management, drift detection and remediation, dependency management, and infrastructure blast radius analysis
  • Proven experience resolving production infrastructure drift safely
  • Significant experience leading production incidents as the accountable owner, including driving root-cause analysis
  • Ability to operate calmly and effectively during high-severity outages
  • Proven experience authoring detailed postmortems and operational remediation plans
  • Strong understanding of operational risk management and production recovery procedures
  • Strong background in observability, monitoring, logging, distributed tracing, and alerting using tools such as Grafana
  • Experience owning CI/CD pipelines, deployment strategies, infrastructure automation, and operational workflows
  • Strong Linux administration, containerization (Docker), networking, and scripting skills
  • Experience with security best practices, identity management (SAML, OIDC, SCIM), and compliance frameworks such as SOC 2, ISO 27001, HIPAA, or PCI
  • Comfortable working directly with auditors and maintaining compliance controls
  • Experience supporting Spring Boot or JVM-based systems in production
  • Experience with runtime security or EDR tooling such as Falco
  • Experience automating joiner/mover/leaver identity workflows using SCIM and IdP tooling
  • AWS certifications such as AWS Certified Solutions Architect - Professional, AWS Certified DevOps Engineer - Professional, and AWS Certified Security - Specialty
  • Ability to read and debug Kotlin or Java backend services from an SRE perspective
  • React, Node.js, or Backstage development experience
  • MuleSoft API Management experience

Benefits

  • Professional development opportunities with international customers
  • Collaborative work environment
  • Career path and mentorship programs that support advancement to new levels

Company Overview

  • Global Technology Services (GTS) is the technology solution of Lean Solutions Group, helping companies scale faster through AI-driven automation, software development, and tech-powered talent. It was founded in 2019 and is headquartered in Medellín, Antioquia, Colombia, with a workforce of 1001-5000 employees. Its website is https://www.lean-tech.io/.

Company H1B Sponsorship

  • Lean Tech has a track record of offering H1B sponsorship, with 1 in 2023 and 1 in 2022. Please note that this does not guarantee sponsorship for this specific role.