Software Engineer for AI Infrastructure

Location: Remote
Compensation: Salary
Reviewed: Thu, Jun 04, 2026
This job expires in: 28 days

Job Summary

The full-time Software Engineer for AI Infrastructure will bring up, validate, and debug large-scale AI clusters, with a focus on benchmarking and optimizing distributed training and inference workloads. The role can be performed remotely or onsite in various locations.

Key responsibilities
  • Bring up, validate, and debug large-scale AI clusters and end-to-end workloads
  • Benchmark AI pre-training, post-training, and inference workloads using NVIDIA AI software stacks
  • Perform root-cause analysis of failures and contribute to failure-attribution tooling across the cluster

Required qualifications
  • Bachelor's or Master's in Computer Science or a related technical field (or equivalent experience)
  • 3+ years of experience developing software for AI, HPC, or systems-level applications
  • Hands-on experience with multi-GPU or multi-node workloads and CUDA-aware distributed execution
  • Experience debugging and scaling distributed systems
  • Strong programming skills in Python and C/C++
