Senior Site Reliability Engineer
Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, Jun 25, 2026
Job Summary
Seeking a Senior Site Reliability Engineer for a full-time remote position. The role focuses on ensuring the reliability of high-load production systems, managing monitoring and alerting, and mentoring team members, across both cloud and on-premises deployments.
Key responsibilities
- Ensure the reliability of services by managing SLIs/SLOs and identifying bottlenecks across the system
- Set up monitoring, metrics, alerts, and dashboards, deciding which key metrics to measure and presenting them clearly
- Investigate incidents, participate in on-call rotations, and lead postmortems to prevent future failures
Required qualifications
- 5+ years of experience in SRE/DevOps with a focus on high-load production systems
- Deep practical knowledge of Docker and Kubernetes, with production experience
- Hands-on experience with Prometheus, Alertmanager, and Grafana for metrics and alerts
- Strong coding skills in Python for automation and tooling purposes
- Experience with cloud platforms such as GCP and/or AWS, alongside solid Linux and networking skills