Principal Site Reliability Engineer

Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 21, 2026
This job expires in: 30 days

Job Summary

The Principal Site Reliability Engineer, a full-time remote role, owns the reliability and operational excellence of cloud-based services. The engineer will define reliability standards, drive SRE practices, and build systems that keep production infrastructure healthy.

Key Responsibilities:
  • Define and refine Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) for critical services
  • Lead the design and implementation of observability platforms and drive initiatives to reduce operational toil through automation
  • Champion reliability engineering best practices and mentor team members on SRE philosophy and cloud engineering

Required Qualifications:
  • 7+ years of experience in scalable, distributed systems architecture
  • 3+ years of hands-on Site Reliability Engineering experience, including SLOs and error budget management
  • 4+ years of experience with cloud platforms, particularly AWS
  • 4+ years of experience in infrastructure as code, such as Terraform or AWS CDK
  • 5+ years of experience in scripting using Python, Shell, or similar languages
