[Remote] Site Reliability Engineer

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. Second Sight Solutions, a subsidiary of Berkeley Research Group (BRG), is a health technology company focused on improving transparency in drug discount data exchange. The company is seeking a Site Reliability Engineer to design, build, and maintain highly available systems and infrastructure, collaborating with software developers and operations teams to improve system reliability.

Responsibilities

Design, implement, and maintain scalable and reliable systems in cloud environments such as Azure Cloud Services
Work with CI/CD platforms (GitHub Actions, GitLab CI)
Provide operational support for full-stack software applications
Increase system resilience through expert-level coding and robust release and change management
Develop service-level indicators and objectives to automate release validation
Improve automation and increase the system’s self-healing capability
Collect operating system data and report performance metrics to stakeholders
Ensure security best practices are followed in cloud infrastructure and application deployments
Manage cloud and database system maintenance, debugging production issues as they arise
Improve reliability, quality, and time-to-market of our suite of software solutions
Partner with security and product teams to define and publish policies, processes, and playbooks to facilitate rapid and effective handling of alerts and incidents
Lead incident management processes; respond to outages and service disruptions promptly

Skills

Bachelor's degree in computer science or a similar field
Five years' experience as a site reliability engineer or in a similar role
Strong programming skills (Golang, Ruby, Python, or similar)
Proven ability to diagnose and monitor performance and reliability issues across the stack
Expertise in Kubernetes
Relevant industry certifications, such as the Site Reliability Engineering (SRE) Foundation certification
Proven experience working with cloud-native infrastructure (Azure Cloud Services, AWS, or GCP)
Experience working with observability and incident management tools (Datadog, OpsGenie, PagerDuty)
Experience scripting operating system tasks with Infrastructure as Code
Impeccable communication skills
Ability to problem-solve in a fast-paced, high-stakes environment

Company Overview

BRG combines world-leading academic credentials with world-tested business expertise, purpose-built for agility and connectivity, which sets it apart and gets its clients ahead. The company was founded in 2010 and is headquartered in Emeryville, California, USA, with a workforce of 1,001 to 5,000 employees. Its website is http://www.thinkbrg.com.
