Senior Site Reliability Engineer
Location: Remote
Compensation: Salary
Reviewed: Mon, Jun 01, 2026
This job expires in: 30 days
Job Summary
The full-time Senior Site Reliability Engineer will support the GeForce NOW team by managing service reliability, driving tools development, and ensuring uptime for GPU cloud gaming services. The role can be performed remotely or onsite in Santa Clara.
Key responsibilities
- Build tools to enhance SRE observability and support Kubernetes migration efforts
- Rapidly debug and triage incidents, and automate daily tasks to maximize operational efficiency
- Collaborate with service owners to maintain and improve service SLOs through system design consulting and capacity management
Required qualifications
- MS or BS in Computer Science/Engineering or a related field, or equivalent experience
- 8+ years of experience in site reliability engineering with large-scale distributed microservices
- Strong background in Kubernetes and experience with multi-region cloud deployments on AWS, GCP, or Azure
- Proficiency writing production-grade code in languages such as Go, Python, or Bash
- Experience with monitoring systems like Datadog or Prometheus and managing deployment pipelines using CI/CD tools