[Remote] Senior Site Reliability Engineer, CCIP
Note: This is a remote position open to candidates in the USA. Chainlink Labs is the industry-standard oracle platform powering decentralized finance. As a Senior Site Reliability Engineer on the CCIP Platform team, you will ensure the reliability and operational excellence of the systems behind Chainlink's Cross-Chain Interoperability Protocol (CCIP), shaping reliability practices and improving service availability.
Responsibilities
- Improve deployment safety and increase delivery velocity by advancing production engineering practices
- Establish distributed tracing across the platform to improve observability and accelerate incident investigation
- Eliminate operational toil through automation that increases engineering efficiency and platform reliability
- Drive adoption of meaningful SLOs, SLIs, and error budgets that guide engineering decisions and improve service health
- Increase platform scalability and operational readiness as CCIP continues to grow
- Strengthen Chainlink's reputation through highly available production systems while reducing operational overhead
Skills
- Demonstrated experience in Site Reliability Engineering, Production Engineering, or a similar role operating large-scale distributed systems
- Deep expertise in defining, implementing, and driving adoption of SLOs, SLIs, and error budgets across engineering organizations
- Experience building and operating production Kubernetes environments that support critical services
- Experience applying OpenTelemetry to improve observability across distributed systems
- Experience improving the reliability, scalability, and operability of production infrastructure
- Demonstrated technical leadership influencing reliability practices across engineering teams
- Experience performing capacity planning and performance tuning for high-throughput distributed services
- Previous experience working on Web3 infrastructure or within a crypto-native engineering organization
- Experience applying chaos engineering or fault-injection techniques to improve production resilience
- Experience partnering with software engineering teams to conduct production-readiness reviews before service launches
- Experience leading on-call operations, including defining rotations, escalation policies, and improving alert quality
Benefits
- All roles with Chainlink Labs are global and remote-based. Unless otherwise stated, we ask that some of your working hours overlap with Eastern Standard Time (EST).
Company Overview