Operations Engineer, HPC Networking

Location: Remote
Compensation: To Be Discussed
Reviewed: Fri, May 15, 2026
This job expires in: 30 days

Job Summary

The Operations Engineer, HPC Networking, is a full-time, hands-on role focused on maintaining and optimizing InfiniBand and Ethernet fabrics, monitoring their performance, and resolving connectivity issues.

Key Responsibilities
  • Monitor health and performance of InfiniBand and Ethernet fabrics
  • Investigate and resolve fabric issues, including connectivity and performance regressions
  • Support fabric bring-up and collaborate with data center operations and customer-facing teams

Required Qualifications
  • Experience operating InfiniBand fabrics in production environments
  • Proficiency in debugging network components, including cables, switches, and drivers
  • Ability to bring up new fabrics from cable pull through validation
  • Experience scripting operational tasks (Bash, Python, Go, etc.)
  • Familiarity with Ethernet RoCE, Spectrum-X, or large-scale GPU cluster networking is a plus
