Senior Software Engineer

Location: Remote
Compensation: Salary
Reviewed: Thu, Jun 04, 2026
This job expires in: 30 days

Job Summary

The full-time Senior Software Engineer will lead the optimization and benchmarking of distributed training and inference workloads, manage large-scale AI clusters, and ensure efficient performance across NVIDIA GPU platforms. The position offers opportunities for remote work.

Key responsibilities
  • Lead the bring-up, validation, and debugging of large-scale AI clusters and end-to-end workloads
  • Profile and optimize workload performance using tools such as Nsight Systems and NCCL tests
  • Conduct root-cause analysis of failures and build resilience and failure-attribution capabilities for large clusters

Required qualifications
  • Bachelor's or Master's in Computer Science or a related technical field (or equivalent experience)
  • 8+ years of experience in software infrastructure for large-scale AI or HPC systems
  • Expertise in debugging and triaging AI applications across the full stack
  • Deep hands-on experience with NCCL and CUDA-aware distributed execution
  • Proficiency in Python and C/C++ programming
