Software Engineer - Reliability
Location: Remote
Compensation: To Be Discussed
Reviewed: Mon, Mar 09, 2026
This job expires in: 30 days
Job Summary
A company is looking for a Software Engineer - Reliability.
Key Responsibilities
- Architect systems for reliability and scale, participating in re-architecture sessions
- Take ownership of multi-cloud GPU clusters, ensuring high availability and performance
- Implement robust security practices to achieve and maintain security certifications
Required Qualifications
- 8+ years of experience as an SRE, production engineer, or infrastructure engineer
- Deep expertise in Linux and containerized systems
- Strong experience with cloud providers such as AWS or OCI
- Familiarity with security best practices and compliance frameworks
- Experience with high-performance networking, such as InfiniBand or RDMA