Senior Software Engineer, Observability

Work from home Full-time role Hiring

About the Role:

We are looking for a Senior Software Engineer to join our Observability team and help build the platform that gives Redpanda’s engineering organization deep visibility into the health, performance, and behavior of our systems. You will own and evolve our Grafana-based observability stack (spanning metrics, logs, and traces) and ensure that every team at Redpanda has the tooling and insights they need to ship reliable, high-performance software.

This is a high-impact role at the intersection of infrastructure and developer experience. You will work closely with platform and product engineering teams to design scalable observability solutions, drive adoption of best practices, and reduce mean time to detection and resolution across our cloud and on-premise deployments.

You Will:

- Design, build, and maintain Redpanda’s observability platform using the Grafana stack (Grafana, Mimir, Loki, Tempo, Alloy/Agent)
- Develop and optimize dashboards, alerts, and SLO/SLI frameworks that give engineering teams actionable insights into system health
- Build and operate scalable metrics, logging, and distributed tracing pipelines that handle high-cardinality data across cloud and on-premise environments
- Instrument services and infrastructure with OpenTelemetry to ensure comprehensive, standards-based telemetry collection
- Partner with platform teams to improve incident detection, root-cause analysis, and mean time to resolution (MTTR)
- Evaluate and integrate new observability tools and techniques, driving continuous improvement of our monitoring capabilities
- Contribute to internal tooling and automation that streamlines observability onboarding for engineering teams
- Participate in the on-call rotation to keep observability infrastructure running and incident-free

You Have:

- 5+ years of experience in software engineering with a focus on observability, monitoring, or infrastructure
- Deep hands-on experience with the Grafana stack (Grafana, Mimir/Prometheus, Loki, Tempo) in production environments
- Strong understanding of metrics, logging, and distributed tracing paradigms and their trade-offs at scale
- Experience with OpenTelemetry (OTel) for instrumentation and telemetry collection
- Proficiency in Go and Python
- Experience running and operating infrastructure on Kubernetes in public cloud environments (AWS, GCP, or Azure)
- Comfort working with a 100% distributed engineering team, collaborating on GitHub, etc.
- Experience with AI coding tools (e.g., Claude Code) and the ability to independently validate, refine, and productionize generated outputs
- Solid understanding of time-series databases, log aggregation systems, and query languages (PromQL, LogQL)

Nice to Have:

- Strong understanding of Go
- Experience operating a SaaS platform with production observability at scale
- Familiarity with eBPF-based observability or continuous profiling tools (e.g., Pyroscope, Parca)
- Experience with infrastructure-as-code (Terraform, Pulumi) and GitOps workflows
- Experience operating or using streaming platforms (e.g., Kafka, Redpanda), either as a user or as a provider
- Experience building or managing multi-tenant observability platforms
- Contributions to open-source observability projects (Grafana, Prometheus, OpenTelemetry, etc.)
