Principal Site Reliability Engineer
Location: Remote
Compensation: Salary
Reviewed: Mon, May 18, 2026
This job expires in: 29 days
Job Summary
The full-time Principal Site Reliability Engineer will design and implement the operational aspects of a large-scale Observability & Telemetry platform, with a focus on performance, real-time monitoring, and system health. The role is remote and spans the entire service lifecycle, from inception to refinement.
Key Responsibilities
- Design, implement, and support operational and reliability aspects of the Observability & Telemetry platform
- Engage in the lifecycle of services from design through deployment and operation, ensuring system health and performance
- Practice sustainable incident response and participate in an on-call rotation to support production systems
Required Qualifications
- BS degree in Computer Science or a related technical field, or equivalent experience
- 15+ years of experience with infrastructure automation and distributed systems design
- 8+ years of experience delivering foundational infrastructure and observability platforms
- Proficiency in one or more programming languages such as Python, Go, Perl, or Ruby
- In-depth knowledge of Linux, networking, and containers