Principal Developer, AI Networking

Location: Remote
Compensation: Salary
Reviewed: Fri, Jun 12, 2026
This job expires in: 26 days

Job Summary

The full-time Principal Developer, AI Networking will profile, analyze, and optimize AI workloads on large-scale GPU and CPU clusters to improve distributed deep learning LLM training and inference, with an emphasis on networking and performance analysis. The role can be performed remotely or onsite.

Key responsibilities
  • Characterizing AI workloads and deep learning models for large-scale LLM training and inference on NVIDIA supercomputers
  • Benchmarking, profiling, and analyzing performance to identify bottlenecks and optimization opportunities, particularly in networking
  • Developing tools for PyTorch trace-based profiling and collaborating with cross-functional teams to provide performance analysis insights

Required qualifications
  • B.Sc. in Computer Science or Software Engineering, or equivalent experience
  • 15+ years of experience with high-performance networking (RDMA, MPI, NCCL, SHARP)
  • Demonstrated ability in performance evaluation techniques and approaches
  • Experience with NVIDIA GPUs, the CUDA library, and deep learning frameworks like TensorFlow or PyTorch
  • Proficiency in programming languages: Python, Bash, and C++
