Site Reliability Engineer

Location: Remote
Compensation: To Be Discussed
Reviewed: Mon, May 25, 2026
This job expires in: 30 days

Job Summary

The full-time Systems Engineer - Site Reliability Engineering is responsible for the reliability and performance of mission-critical cloud services. The role oversees incident management, drives automation efforts, and collaborates with cross-functional teams to align SRE strategies with business objectives.

Key responsibilities
  • Ensure the reliability, availability, and performance of cloud services through best practices in monitoring, alerting, and incident management
  • Oversee high-severity incident management, driving quick resolution and conducting post-incident analyses to prevent recurrence
  • Develop and execute SRE strategies aligned with business goals, communicating service health and performance metrics to stakeholders

Required qualifications
  • Undergraduate degree in engineering or computer science, or equivalent experience/certification
  • 5+ years of hands-on experience in designing and operating production-grade systems, with 2+ years as a Site Reliability Engineer
  • Deep understanding of SRE practices, including service level objectives and incident response processes
  • Expertise in AWS services and experience with container orchestration engines such as Kubernetes
  • Proven automation and programming experience in languages such as Python or PowerShell
