[Remote] Technical Program Manager, Data Center Operations
Note: This is a remote position open to candidates in the USA. Fluidstack is focused on delivering vast compute power for AI and is seeking a Technical Program Manager for Data Center Operations. The role involves overseeing the site handover framework, ensuring smooth transitions from construction to live operations, and managing critical operational processes to improve stability and efficiency.
Responsibilities
- Own the end-to-end site handover framework: define the gates, acceptance criteria, and sign-off procedures that move a new facility from construction to live operations without dropped terms or late surprises
- Embed into design, construction, and due diligence teams early enough to shape maintainability requirements before they become field problems
- Drive the cross-functional handover rhythm across training, documentation, systems access, and knowledge transfer, surfacing blockers weeks before they hit the go-live schedule
- Build and maintain the SOPs that govern critical datacenter operations across the fleet, with metrics that track adoption, execution quality, and efficiency at each site
- Lead incident management and stability improvement programs, including post-incident reviews with root cause analysis, corrective action tracking, and preventive maintenance oversight that reduces unplanned outages across the global footprint
- Produce the dashboards and reporting that give leadership visibility into stability metrics and incident trends, and run the CAPA programs that turn that data into durable fixes
Skills
- You have run program management in mission-critical environments where a delayed handover or missed SOP had real operational consequences, not just schedule slippage
- You have designed operational frameworks from scratch: handover gates, SOP libraries, incident management programs built without a legacy system to copy from
- You quarterback across design, construction, supply chain, and site ops teams simultaneously, and other teams call you when a cross-functional workstream is stuck
- You write clearly enough to distill a complex operational issue into a decision and a next action for a site lead, an executive, or a counterparty who was not in the room
- You track incident trends and CAPA status in live dashboards and follow corrective actions through to closure, not just to initial assignment
- You have personally built or maintained SOPs and measured whether they were actually followed, not just whether they existed
- Bonus: ITIL, PMP, or PgMP certification
- Bonus: Hyperscale or large colo operator experience
- Bonus: Familiarity with ASHRAE, Uptime Institute, or TIA-942 standards
- Bonus: Exposure to datacenter construction and commissioning processes
Benefits
- Equity
- Retirement or pension plan, in line with local norms
- Health, dental, and vision insurance
- Generous PTO policy, in line with local norms