Senior Site Reliability Engineer

Location: Remote
Compensation: Salary
Staff Reviewed: Mon, Nov 25, 2024
This job expires in: 22 days

Job Summary

A company is looking for a Senior Site Reliability Engineer to lead its observability initiative.

Key Responsibilities
  • Design and implement a comprehensive observability strategy using Datadog
  • Develop and maintain sophisticated alerting frameworks and optimize SLIs, SLOs, and error budgets
  • Lead incident response and postmortem analyses, and reduce toil through automation and infrastructure as code

Required Qualifications
  • 5+ years of hands-on SRE experience in large-scale production environments
  • Deep expertise with Datadog, including APM, Infrastructure Monitoring, and Log Management
  • Strong experience with monitoring as code using Terraform or similar tools
  • Proficiency in at least one programming language (Python, Go, or Java preferred)
  • Experience with cloud platforms (AWS, Azure, or GCP) and containerized environments
