[Remote] Senior Site Reliability Engineer, CORE (Member Experience / Resilience Operations)

Work from home Full-time role Hiring

Note: This is a remote position open to candidates in the USA. Netflix is a global entertainment company that aims to push the boundaries of storytelling and technology. It is seeking a Senior Site Reliability Engineer to enhance the reliability and operational excellence of its streaming services, collaborating closely with engineering teams to ensure a seamless member experience.

Responsibilities

  • Design and evolve resilient infrastructure for Netflix member-facing services, ensuring our systems are scalable, fault-tolerant, and operable at a global scale
  • Take a data-driven approach to reliability to identify and address systemic risk
  • Partner with engineering and product teams to embed reliability and observability into the full software development lifecycle—from design and readiness reviews through rollout and ongoing operations
  • Define and measure Service Level Objectives (SLOs) and other reliability metrics that matter to the member experience, using them to guide capacity planning, operational priorities, and tradeoffs between reliability, feature velocity, and cost
  • Build and improve automated processes for deployment, monitoring, capacity management, and incident response to ensure our operations are fast, reliable, and repeatable
  • Participate in on-call rotations for critical Streaming services, helping ensure 24/7 availability and a great member experience
  • Lead and contribute to incident response—from triage and mitigation through follow-ups—focusing on learning, systemic fixes, and avoiding repeat issues
  • Proactively identify and reduce sources of instability in distributed systems by analyzing how our systems actually fail in production and driving architectural or operational improvements
  • Champion a culture of reliability across business domains, acting as a force multiplier: creating clear documentation, developing best-practice guides, and building tooling that enables other teams to adopt reliability improvements at scale

Skills

  • 5+ years of experience in an SRE, Production Engineering, or similar role operating business-critical, high-traffic services in production
  • Strong coding skills in one or more languages such as Python, Go, or Java, with a focus on automating solutions instead of relying on manual operations
  • Fluency in modern cloud infrastructure: hands-on experience with large-scale environments on AWS/Azure/GCP, along with abstracted compute and platform orchestration systems
  • Deep understanding of large-scale distributed systems, including common failure modes, performance bottlenecks, and how to design for resilience and graceful degradation
  • Track record of proactively identifying reliability risks and gaps through metrics, incidents, architecture reviews, or resilience testing—and implementing pragmatic, scalable solutions to mitigate them
  • Strong observability and performance tuning skills: you can use metrics, logs, and traces to debug issues in complex systems, and you're comfortable profiling and optimizing services to meet latency, availability, or efficiency goals
  • Experience with incident management and response: you can navigate ambiguous, high-pressure production issues, drive coordinated response, and follow through with durable improvements
  • Strong collaboration and influence skills: you communicate clearly, build trust with partner teams, and can guide engineering teams toward better reliability practices without relying on authority
  • Ability to balance reliability, velocity, and cost: you're comfortable making and explaining tradeoffs, and using data (SLOs, error budgets, performance metrics) to guide decision-making
  • Growth mindset and curiosity: you are eager to learn, comfortable challenging assumptions (including your own), and motivated by continuous improvement of systems, processes, and yourself
  • Embraces agency: given a loosely defined goal, you thrive by defining the work needed to accomplish it while farming for dissent and feedback from the team and our stakeholders

Benefits

  • Health Plans
  • Mental Health support
  • A 401(k) Retirement Plan with employer match
  • Stock Option Program
  • Disability Programs
  • Health Savings and Flexible Spending Accounts
  • Family-forming benefits
  • Life and Serious Injury Benefits
  • Paid leave of absence programs
  • Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick leave
  • Full-time salaried employees are immediately entitled to flexible time off

Company Overview

  • Netflix is an online streaming platform that enables users to watch TV shows and movies. It was founded in 1997 and is headquartered in Los Gatos, California, USA, with a workforce of more than 10,000 employees. Its website is https://www.netflix.com.

Company H1B Sponsorship

  • Netflix has a track record of offering H1B sponsorships, with 152 in 2026, 310 in 2025, 309 in 2024, 191 in 2023, 261 in 2022, 268 in 2021, and 225 in 2020. Please note that this does not guarantee sponsorship for this specific role.