Principal Site Reliability Engineer

Location: Remote
Compensation: Salary
Reviewed: Mon, May 18, 2026
This job expires in: 29 days

Job Summary

The full-time Principal Site Reliability Engineer will design and implement the operational aspects of a large-scale Observability & Telemetry platform, with a focus on performance, real-time monitoring, and system health. The role is remote and spans the entire service lifecycle, from inception to refinement.

Key responsibilities
  • Design, implement, and support operational and reliability aspects of the Observability & Telemetry platform
  • Engage in the lifecycle of services from design through deployment and operation, ensuring system health and performance
  • Practice sustainable incident response and participate in an on-call rotation to support production systems

Required qualifications
  • BS degree in Computer Science or a related technical field, or equivalent experience
  • 15+ years of experience with infrastructure automation and distributed systems design
  • 8+ years of experience delivering foundational infrastructure and observability platforms
  • Proficiency in one or more programming languages such as Python, Go, Perl, or Ruby
  • In-depth knowledge of Linux, Networking, and Containers
