Staff Site Reliability Engineer
Location: Remote
Compensation: To Be Discussed
Reviewed: Wed, Jun 17, 2026
This job expires in: 28 days
Job Summary:
The full-time Staff Site Reliability Engineer will lead global platform reliability, drive the observability strategy on Google Cloud Platform (GCP), manage complex networking infrastructure, and optimize high-throughput data environments. The role is fully remote and open to candidates anywhere in the United States or Canada.
Key Responsibilities:
- Architect, optimize, and troubleshoot complex networking infrastructure across all OSI layers
- Design and scale the unified observability platform using the Grafana Labs suite
- Deploy machine learning models for automated anomaly detection and intelligent alerting
Required Qualifications:
- 8+ years of experience in SRE, Production Engineering, or Distributed Systems infrastructure roles
- Expertise in Google Kubernetes Engine (GKE), containerization, and container orchestration
- Proven experience managing high-throughput Apache Kafka pipelines and large-scale data environments
- Hands-on experience with the Grafana ecosystem, including Grafana Enterprise/Cloud and Prometheus
- Advanced proficiency in Go and Python for custom infrastructure tooling and data integration