[Remote] Senior Manager, Network Engineering
Note: This is a remote position open to candidates in the USA. Core42 is a leader in AI-powered cloud and digital infrastructure, driving transformative technology solutions globally. The Senior Manager of Network Engineering will lead the strategy, design, deployment, and operational excellence of network infrastructure supporting large-scale GPU, AI, and HPC environments.
Responsibilities
- Develop and execute the network engineering strategy for large-scale AI, GPU, and HPC infrastructure, ensuring scalability, performance, and operational resilience
- Lead the design, implementation, and lifecycle management of high-performance network environments, including Ethernet, InfiniBand, RoCEv2, management, storage, and tenant-isolated network fabrics
- Oversee multi-tenant network architectures, including segmentation, VRFs, VLANs, routing domains, and secure traffic flow between environments
- Establish and enforce network security standards, including firewall policy design, access controls, and traffic inspection aligned with organizational and regulatory requirements
- Partner with infrastructure engineering, security, and operations teams to ensure network architecture integrates with compute, storage, and data center design
- Collaborate with ISPs, carriers, and hardware vendors for service delivery, escalation management, capacity planning, and performance optimization
- Drive network observability through monitoring, alerting, logging, and performance analysis across critical services and transport layers
- Lead incident response and root cause analysis for major network events, ensuring timely resolution and continuous improvement
- Build, mentor, and manage a high-performing network engineering team, setting priorities, development plans, and technical standards
- Define and maintain documentation standards for architecture, topology, policies, procedures, and change management
- Drive automation and standardization using tools such as Ansible, Python, or equivalent frameworks
- Establish and track KPIs for availability, latency, throughput, incident response, and network reliability
- Serve as the primary network engineering liaison to senior leadership, providing roadmap recommendations and risk assessments
- Evaluate emerging technologies to improve performance, security, and scalability across AI infrastructure environments
Skills
- 10+ years of experience in network engineering, including large-scale data center, HPC, cloud, or AI environments
- 5+ years of leadership experience managing network engineering teams or programs
- Deep expertise in data center networking: BGP, EVPN/VXLAN, VRFs, VLANs, routing, switching, and segmentation
- Strong experience with InfiniBand, RoCEv2, and high-speed Ethernet in GPU or HPC environments
- Proven experience designing and operating secure multi-tenant network architectures
- Strong understanding of firewall platforms and network security controls in complex environments
- Experience working with ISPs, carriers, and vendors for provisioning and escalation management
- Hands-on experience with network monitoring, telemetry, packet capture, and troubleshooting tools
- Demonstrated ability to build, lead, and scale technical engineering teams
- Strong communication skills and the ability to translate technical strategy into business impact
- Experience in GPUaaS, AI cloud, sovereign cloud, or large-scale HPC environments
- Knowledge of zero trust architecture and microsegmentation
- Familiarity with data center physical infrastructure (power, cooling) as it relates to networking
- Experience with infrastructure-as-code and network automation
- Certifications such as CCNP, CCIE, JNCIP, PCNSE, or equivalent
- Experience with multi-site or geographically distributed network environments
Benefits
- Bonus
- LTIP
- Benefits
Company Overview