Principal Software Engineer

Location: Remote
Compensation: Salary
Reviewed: Mon, May 18, 2026
This job expires in: 29 days

Job Summary

The full-time Principal Software Engineer will shape the technical direction for production engineering by defining strategies for large-scale GPU cluster operations, with a focus on automation and reliability in both cloud and on-prem environments.

Key responsibilities
  • Define and execute the technical strategy for DGX Cloud cluster operations, emphasizing automation and reliability
  • Lead the design and implementation of systems for cluster lifecycle management, validation, and observability
  • Mentor engineers and influence cross-functional teams on platform, infrastructure, and operational standards

Required qualifications
  • 15+ years of experience building and operating large-scale distributed systems or cloud infrastructure
  • Deep expertise in Kubernetes, Linux, infrastructure automation, and production operations
  • Strong programming skills in Go, Python, or similar languages
  • Proven ability to lead complex cross-organizational technical initiatives
  • BS/MS in Computer Science or equivalent experience
