Senior Site Reliability Engineer
Job is Expired
Location: Remote
Compensation: Salary
Reviewed: Mon, Feb 23, 2026
Job Summary
A company is looking for a Senior Site Reliability Engineer, AI Factory.
Key Responsibilities
- Run commissioning and provisioning for GPU systems and manage firmware versions
- Monitor hardware state, identify bottlenecks, and ensure peak performance
- Develop operations strategies and maintain consistency with SLAs across infrastructure
Required Qualifications
- BS or MS degree in Computer Engineering/Science or related field, with 10+ years of relevant experience
- Experience managing GPU fleets and improving data center operations
- Expertise in BMS and power management, and in configuration management solutions
- Experience with data center inventory management systems and developing QCOW2 images
- Proven track record of collaboration with multiple teams to achieve operational excellence