Site Reliability Engineer
Location: Remote
Compensation: Salary
Reviewed: Wed, Dec 10, 2025
This job expires in: 17 days
Job Summary
A company is looking for a Site Reliability Engineer to enhance observability and reliability practices within a distributed environment.
Key Responsibilities
- Own and evolve the observability stack using various monitoring tools and AWS services
- Design and maintain SLIs, SLOs, and error budgets to improve system reliability
- Support incident investigations and keep the observability stack cost-efficient
Required Qualifications
- Hands-on experience with production observability systems such as Prometheus and Grafana
- Experience with Thanos or large-scale metrics systems
- Strong understanding of SLIs, SLOs, and incident response workflows
- Solid experience with Kubernetes and Infrastructure as Code (Terraform preferred)
- Proficiency in scripting or programming (Go, Python, or Bash)