Senior Site Reliability Engineer

This job has been removed

Location: Remote

Compensation: To Be Discussed

Reviewed: Thu, Jun 25, 2026

This job expires in: 21 days

Job Category: Information Technology

Weekly Hours: Full Time

Employment Status: Permanent

Employer Type: Employer

Career Level: Experienced

Job Summary

Seeking a Senior Site Reliability Engineer for a full-time remote position focused on ensuring the reliability of high-load production systems, managing monitoring and alerting setups, and mentoring team members while supporting both cloud and on-premises deployments.

Key responsibilities

Ensure the reliability of services by managing SLIs/SLOs and identifying bottlenecks across the system
Set up monitoring, metrics, alerts, and dashboards, determining which key metrics to measure and presenting them clearly
Investigate incidents, participate in on-call rotations, and lead postmortems to prevent future failures

Required qualifications

5+ years of experience in SRE/DevOps with a focus on high-load production systems
Deep practical knowledge of Docker and Kubernetes, with production experience
Hands-on experience with Prometheus, Alertmanager, and Grafana for metrics and alerts
Strong coding skills in Python for automation and tooling
Experience with cloud platforms such as GCP and/or AWS, alongside solid Linux and networking skills
