
Senior Software Engineer - Observability

Remote, USA Full-time Posted 2026-06-21
Redpanda is pioneering the Agentic Data Plane (ADP) - a new category in AI infrastructure that makes it simple and secure to connect AI agents with enterprise data and systems. Built on a multi-modal data streaming engine, Redpanda empowers agentic applications that reason and act in real time with speed, autonomy, and precision. Global leaders including Activision Blizzard, Cisco, Moody's, Texas Instruments, Vodafone, and two of the top five banks in the U.S. rely on Redpanda to process hundreds of terabytes of data a day. Backed by premier venture investors Lightspeed, GV, and Haystack VC, Redpanda is a diverse, people-first organization with teams distributed around the globe.

About the Role:

We are looking for a Senior Software Engineer to join our Observability team and help build the platform that gives Redpanda’s engineering organization deep visibility into the health, performance, and behavior of our systems. You will own and evolve our Grafana-based observability stack—spanning metrics, logs, and traces—and ensure that every team at Redpanda has the tooling and insights they need to ship reliable, high-performance software.

This is a high-impact role at the intersection of infrastructure and developer experience. You will work closely with platform and product engineering teams to design scalable observability solutions, drive adoption of best practices, and reduce mean time to detection and resolution across our cloud and on-premise deployments.

You Will:

  • Design, build, and maintain Redpanda’s observability platform using the Grafana stack (Grafana, Mimir, Loki, Tempo, Alloy/Agent)
  • Develop and optimize dashboards, alerts, and SLO/SLI frameworks that give engineering teams actionable insights into system health
  • Build and operate scalable metrics, logging, and distributed tracing pipelines that handle high-cardinality data across cloud and on-premise environments
  • Instrument services and infrastructure with OpenTelemetry to ensure comprehensive, standards-based telemetry collection
  • Partner with platform teams to improve incident detection, root-cause analysis, and mean time to resolution (MTTR)
  • Evaluate and integrate new observability tools and techniques, driving continuous improvement of our monitoring capabilities
  • Contribute to internal tooling and automation that streamlines observability onboarding for engineering teams
  • Participate in the on-call rotation to keep observability infrastructure running and incident-free

You Have:

  • 5+ years of experience in software engineering with a focus on observability, monitoring, or infrastructure
  • Deep hands-on experience with the Grafana stack (Grafana, Mimir/Prometheus, Loki, Tempo) in production environments
  • Strong understanding of metrics, logging, and distributed tracing paradigms and their trade-offs at scale
  • Experience with OpenTelemetry (OTel) for instrumentation and telemetry collection
  • Proficiency in at least one systems-level language (Go strongly preferred) and scripting languages (Python, Bash)
  • Experience running and operating infrastructure on Kubernetes in public cloud environments (AWS, GCP, or Azure)
  • Comfort working with a 100% distributed engineering team and collaborating through GitHub and similar tools
  • Solid understanding of time-series databases, log aggregation systems, and query languages (PromQL, LogQL)

Nice to Have:

  • Strong understanding of Go
  • Experience operating a SaaS platform with production observability at scale
  • Familiarity with eBPF-based observability or continuous profiling tools (e.g., Pyroscope, Parca)
  • Experience with infrastructure-as-code (Terraform, Pulumi) and GitOps workflows
  • Experience operating or using streaming platforms (e.g., Kafka, Redpanda), as either a user or a provider
  • Experience building or managing multi-tenant observability platforms
  • Contributions to open-source observability projects (Grafana, Prometheus, OpenTelemetry, etc.)

Join Redpanda if you’d enjoy being part of a fast-moving, diverse, people-first organization with team members around the globe and a culture based on trust, transparency, communication, and kindness.

