[Remote] Senior Site Reliability Engineer, Government
Note: This is a remote position open to candidates in the USA. SentinelOne is a company at the intersection of AI and security, pioneering a new operating model for cybersecurity. As a Senior Site Reliability Engineer, you will own the technical reliability of government environments and coordinate compliant, efficient deployments, collaborating with cross-functional teams to lead best practices for cloud infrastructure and government release processes.
Responsibilities
- Drive continuous software delivery, resolve incidents, run postmortems, and create automation strategies for deployment, self-testing, and alerting
- Lead and execute incident management for production issues, ensuring rapid recovery, root cause analysis, and preventative follow-up actions
- Improve and optimize the observability strategy by collaborating with application engineering teams to design monitoring solutions that enhance alerting capabilities and reduce noise
- Define, implement, and monitor SLOs, SLIs, and SLAs in collaboration with product and engineering teams to align with business objectives
- Design, develop, and maintain software solutions that address operational, compliance, and pipeline challenges
- Own and coordinate all government environment releases, driving process improvements to enhance the release pipeline's efficiency, reliability, and visibility
- Understand product architecture and service dependencies to manage risk and implement effective testing strategies
- Partner cross-functionally with engineering, product, SecOps, compliance, and leadership teams to align priorities, define testing strategies, and resolve challenges
- Ensure all infrastructure and deployments meet FedRAMP requirements, government regulations, and industry standards, while maintaining required release documentation and risk assessments
Skills
- 5+ years of experience in SRE, DevOps, or Infrastructure Engineering for SaaS products, with 4+ years running operations at a large scale
- 2+ years of production experience with a container orchestration system and Continuous Delivery
- Strong understanding of compliance frameworks relevant to government deployments (e.g., FedRAMP, DoD, NIST 800-53, NIST 800-137)
- Multi-cloud experience in AWS/Google Cloud Platform
- Demonstrated experience with at least one main programming language (Python, Go, Ruby, etc.) and proficiency in bash scripting to improve operational workflows
- Familiarity with GitOps frameworks, IaC tooling (Terraform or Pulumi), and deployment strategies (blue-green, rolling deploys, canary deploys)
- Experience with industry-standard observability stacks (Prometheus, Grafana, ELK, OpenTelemetry, etc.) and incident management processes
- Proven background implementing and supporting FedRAMP, security, risk management, and compliance processes for software releases
- Experience working directly with government agencies or in highly regulated industries
- Familiarity with testing strategies and automation in large-scale environments
- Due to Federal Government contract requirements, U.S. citizenship and a work location in the United States are required
- Kubernetes preferred
- Expertise within AWS preferred
Benefits
- Restricted Stock Units (RSUs)
- Employee Stock Purchase Plan (ESPP)
- Flexible time off
- Paid company holidays and paid sick time
- Gender-neutral parental leave
- Grandparent leave
- Medical, dental, and vision coverage
- 401(k) retirement plan with company match
- Life and disability insurance
- Health and dependent care FSA
- Voluntary benefits (hospital, accident, critical illness)
- Employee Assistance Program (EAP)
- ARAG pre-paid legal
- Nationwide pet insurance
- Cancer Care program
- Global business travel medical insurance
- Home office allowance
- Mobile phone reimbursement
- Wellness coach
- Wellness/gym reimbursement
- Fertility coverage
- Adoption & surrogacy reimbursement
Company Overview