Senior Site Reliability Engineer

Location: Remote

Compensation: To Be Discussed

Reviewed: Wed, Mar 11, 2026

This job expires in: 20 days

Job Category: Information Technology

Employment Status: Permanent

Employer Type: Employer

Career Level: Experienced, Senior Level

Job Summary

A company is looking for a Senior Site Reliability Engineer.

Key Responsibilities

Own fleet reliability and lead the strategy for SaaS infrastructure, including SLO definition and capacity planning
Design and evolve infrastructure on GCP and AWS, focusing on non-deterministic AI workloads
Drive operational excellence by evolving incident management practices and leveraging AI for root cause analysis

Required Qualifications

5+ years of experience operating cloud infrastructure (GCP and/or AWS) with Terraform and Kubernetes
Experience or strong interest in operating LLM-based systems or agentic workloads
Understanding of distributed systems principles and their application in infrastructure decisions
Proficiency in at least one modern programming language (TypeScript, Java, Go, or Python)
Ability to communicate complex infrastructure trade-offs to technical and non-technical stakeholders
