Senior AI Infrastructure Engineer
Location: Remote
Compensation: To Be Discussed
Reviewed: Tue, Jun 30, 2026
This job expires in: 26 days
Job Summary
In this full-time remote position, the Senior AI Infrastructure & Platform Operations Engineer serves as a technical leader who drives operational excellence across large-scale AI infrastructure environments. The role focuses on NVIDIA GPU technology, Kubernetes, and service reliability, and includes mentoring team members and resolving complex technical issues.
Key responsibilities:
- Lead the investigation and resolution of complex infrastructure and platform-related incidents
- Provide technical leadership for Kubernetes platform operations and drive improvements in operational processes
- Mentor AI Infrastructure & Platform Operations Engineers and develop operational standards and best practices
Required qualifications:
- 7+ years of experience in infrastructure operations, platform operations, or related technical roles
- Expert-level Linux administration and troubleshooting skills
- Strong experience operating Kubernetes in production environments
- Proven experience leading technical investigations and managing complex incidents
- Strong understanding of observability, monitoring, and service reliability practices