Senior Site Reliability Engineer

Location: Remote
Compensation: Salary
Reviewed: Mon, Jun 08, 2026
This job expires in: 26 days

Job Summary

The full-time Senior Site Reliability Engineer will join a high-performing remote team and own the reliability and automation of critical AI infrastructure, keeping systems resilient and secure while building automation tools that streamline operational workflows.

Key responsibilities
  • Manage the reliability, monitoring, and incident response lifecycle for AI infrastructure services, including on-call support and root cause analysis
  • Develop automation and tooling to enhance operational IT workflows and improve deployment velocity across CI/CD frameworks and Kubernetes environments
  • Collaborate with the Infrastructure team to extend CI/CD frameworks and integrate security tools into deployment pipelines
Required qualifications
  • 5+ years of experience in automating and supporting cloud infrastructure (AWS) and network environments
  • Proven experience with containerized workloads using Docker and Kubernetes in production settings
  • Proficiency in at least one scripting or programming language (Python, Bash, Ruby, or Go)
  • Experience leading incident response in environments with strict SLAs, including root cause analysis and measurable reliability improvements
  • Ability to use generative AI responsibly while maintaining human oversight of workflows
