[Remote] Network Front-End Engineer, AI Infrastructure Operations
Note: This is a remote position open to candidates in the USA. Nscale is a GPU cloud engineered for AI, providing infrastructure for AI-focused companies. The Network Front-End Engineer will design and operate front-end networking services, ensuring the reliability and efficiency of the networking infrastructure.
Responsibilities
- Designing, deploying, and operating high-speed Ethernet-based front-end networks (data centre fabrics, management networks, inference traffic paths, and storage connectivity)
- Working with deployment teams to ensure BOMs are correct and fit for purpose, with a strong focus on optics, cabling, and hardware selection
- Providing input into DC layout, rack elevations, and reference architectures to ensure front-end networks are implemented correctly and can scale
- Troubleshooting performance and connectivity issues on Ethernet front-end networks, including routing, optics, and long-haul circuits
- Automating the deployment, initial stand-up, and day-to-day operations of multi-vendor Ethernet hardware and platforms
- Designing, implementing, and supporting WAN and long-haul infrastructure (including carrier circuits, DCI, and internet transit)
- Collaborating with Architecture and Operations teams to maintain high reliability, observability, and operational efficiency across front-end networks
Skills
- Proven experience in designing, deploying, and operating high-speed Ethernet networks, including technologies such as VLAN, LACP, MLAG, BGP, OSPF, EVPN, and VXLAN
- Strong knowledge of WAN and long-haul technologies, including MPLS, BGP, IPsec, GRE, SD-WAN, carrier Ethernet, and associated routing protocols
- Solid understanding of data centre networking topologies and best practices for front-end / infrastructure networks
- Strong Python or Bash scripting skills for automation and tooling
- Strong optics and hardware knowledge to assist with BOM design, cabling standards, and troubleshooting
- Experience working in fast-paced operational environments with a focus on reliability and automation
- Hands-on experience with Arista (EOS) and/or Nokia networking platforms
- Experience with large-scale DCI, long-haul optical transport, or carrier circuit management
- Familiarity with storage networking over Ethernet and shared storage connectivity
- Experience with network automation frameworks, validation tooling, or observability platforms (e.g., streaming telemetry, Prometheus, Grafana)
- Prior exposure to AI or hyperscale infrastructure environments
Company Overview