[Remote] Site Reliability Engineer with Linux
Note: This is a remote position open to candidates in the USA. Dice is looking for a Senior Site Reliability Engineer (Linux) to build and operate systems across its infrastructure. The role involves automation, debugging, and improving reliability in a large-scale hybrid environment.
Responsibilities
- Build automation for Linux host lifecycle (config, patching, images)
- Own system services, base images, and infrastructure components
- Debug production issues across OS, performance, and service layers
- Work across codebases (C, Go, Python, Ruby) to diagnose and fix issues
- Lead projects from ambiguous problems to production
- Improve reliability through automation and system design
- Partner on security and FedRAMP requirements
- Participate in a sustainable on-call rotation (~16 days/year)
Skills
- 7+ years working with Linux in production
- Strong automation skills (Python and/or Ruby, Ansible preferred)
- Experience debugging complex systems issues
- Comfortable working across cloud and on-prem environments
- U.S. Person required (FedRAMP; U.S.-based work)
- Docker / Kubernetes
- AMIs or container image building
- Go, C, or other systems-level languages
- Experience with compliance environments (FedRAMP, NIST, etc.)
Company Overview