Operations Engineer, HPC Networking
Location: Remote
Compensation: To Be Discussed
Reviewed: Fri, May 15, 2026
This job expires in: 30 days
Job Summary
The Operations Engineer, HPC Networking position is a full-time, hands-on role focused on maintaining and optimizing InfiniBand and Ethernet fabrics, monitoring their performance, and resolving connectivity issues.
Key Responsibilities
- Monitor health and performance of InfiniBand and Ethernet fabrics
- Investigate and resolve fabric issues, including connectivity and performance regressions
- Support fabric bring-up and collaborate with data center operations and customer-facing teams
Required Qualifications
- Experience operating InfiniBand fabrics in production environments
- Proficiency in debugging network components, including cables, switches, and drivers
- Ability to bring up new fabrics from cable pull through validation
- Experience with scripting for operational tasks (Bash, Python, Go, etc.)
- Familiarity with Ethernet RoCE, Spectrum-X, or large-scale GPU cluster networking is a plus