Site Reliability Engineer

Location: Remote
Compensation: To Be Discussed
Reviewed: Fri, Mar 13, 2026
This job expires in: 30 days

Job Summary

A company is looking for a Site Reliability Engineer.

Key Responsibilities
  • Develop and maintain observability solutions using platforms like Datadog, Prometheus, and Grafana
  • Lead incident management efforts, including coordinating responses and troubleshooting issues
  • Collaborate with product engineering teams to architect reliable systems and implement monitoring strategies

Required Qualifications
  • 4+ years of experience in Site Reliability Engineering or similar DevOps roles
  • 2+ years of hands-on experience with Kubernetes and managing its infrastructure
  • Strong experience with modern monitoring stacks including Prometheus, Grafana, and Datadog
  • Experience with Infrastructure as Code tools such as Terraform and Helm
  • Expertise with at least one major cloud service provider (AWS, GCP, or Azure)
