Staff Site Reliability Engineer
Location: Remote
Compensation: Salary
Reviewed: Tue, Jun 16, 2026
This job expires in: 30 days
Job Summary
The Staff Site Reliability Engineer, a full-time remote role, will lead the development of AI-assisted reliability tooling, enhance observability, manage incident response, and mentor engineering teams to improve system reliability and operational efficiency.
Key Responsibilities
- Lead the development of internal AI-assisted reliability tools to expedite outage resolution
- Improve observability coverage for critical customer-facing systems throughout the development lifecycle
- Own the end-to-end incident response process, ensuring issues are thoroughly documented and understood
Required Qualifications
- Deep experience in Site Reliability Engineering, platform engineering, or software engineering with hands-on operational ownership
- Fluency with Kubernetes, Linux, cloud platforms, and observability tooling
- Strong software engineering skills in Python or Go, with a proven track record of building reliable internal tools
- Experience improving reliability through engineering and automation
- Comfort in leading technically ambiguous work and influencing cross-team direction