
Site Reliability Engineer

Work from home, full-time role

The Site Reliability Engineer (SRE) is responsible for ensuring the availability, scalability, performance, and resiliency of enterprise cloud platforms across Azure and AWS environments. This role combines software engineering, automation, and infrastructure expertise to operationalize reliability engineering practices, drive cloud-native resiliency patterns, and enable business-critical applications to meet defined SLAs, SLOs, and compliance requirements. The SRE partners with engineering, security, and operations teams to implement observability, incident response frameworks, and reliability automation, aligning with enterprise architecture standards and regulatory expectations.

Key Accountabilities/Deliverables:

- Design and implement highly available, fault-tolerant architectures using cloud-native services (microservices, containers, serverless)
- Define and operationalize SLOs, SLIs, and error budgets for critical applications and platforms
- Build and maintain Infrastructure as Code (IaC) with Terraform to ensure repeatable and compliant deployments
- Develop automated remediation and self-healing capabilities to reduce MTTR and improve system resilience
- Establish enterprise-level monitoring, logging, and observability frameworks (Datadog, Azure Monitor, CloudWatch, OpenTelemetry, Azure Application Insights)
- Drive cost optimization (FinOps) initiatives, including resource utilization tracking and rightsizing recommendations
- Support DR/BCP strategy execution, including failover testing and regional isolation validation
- Collaborate with application teams to embed reliability engineering practices into CI/CD pipelines

Technical Knowledge and Understanding:

- Strong expertise in cloud platforms (Azure, AWS)
- Deep understanding of cloud-native architecture patterns: microservices, containers (Azure Container Apps/AKS/EKS), and serverless (Azure Functions/AWS Lambda)
- Proficiency in Infrastructure as Code (Terraform, ARM/Bicep)
- Experience with observability platforms (Datadog, Azure Monitor, Azure Application Insights)
- Knowledge of CI/CD pipelines and GitOps practices
- Expertise in system reliability concepts:
  - SLI / SLO / SLA management
  - Chaos engineering
  - High availability and fault isolation
- Familiarity with security, compliance, and regulatory controls (SOC, ISO, cloud security frameworks)

Experience:

- 5+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering
- Proven experience supporting mission-critical production systems at scale
- Hands-on experience with incident management and on-call operations
- Experience implementing automated monitoring, alerting, and remediation frameworks
- Exposure to regulated environments (insurance, financial services) preferred
- Demonstrated ability to work across cross-functional architecture, engineering, and operations teams

Applicants must be authorized to work for any employer in the U.S. We are unable to sponsor or take over work authorization sponsorship now or in the future for this position.

At Core Specialty, you will receive a competitive salary and opportunities for professional development and advancement. We offer medical, dental, vision, and life insurance; short- and long-term disability; a 401(k) plan with a Company match of 100% of a 6% contribution; an Employee Assistance Plan; a Health Savings Account, Flexible Spending Account, and Health Reimbursement Account; and a wellness program.
