Site Reliability Engineer
Location: Remote
Compensation: To Be Discussed
Reviewed: Mon, May 25, 2026
This job expires in: 30 days
Job Summary
The full-time Systems Engineer - Site Reliability Engineering is responsible for ensuring the reliability and performance of mission-critical cloud services. The role oversees incident management, drives automation efforts, and collaborates with cross-functional teams to align SRE strategies with business objectives.
Key Responsibilities
- Ensure the reliability, availability, and performance of cloud services through best practices in monitoring, alerting, and incident management
- Oversee high-severity incident management, driving quick resolution and conducting post-incident analyses to prevent recurrence
- Develop and execute SRE strategies aligned with business goals, communicating service health and performance metrics to stakeholders
Required Qualifications
- Undergraduate degree in engineering or computer science, or equivalent experience/certification
- 5+ years of hands-on experience designing and operating production-grade systems, including 2+ years as a Site Reliability Engineer
- Deep understanding of SRE practices, including Service Level Objectives and Incident Response Processes
- Expertise in AWS services and experience with container orchestration engines like Kubernetes
- Proven automation and programming experience in languages such as Python or PowerShell