Lead Site Reliability Engineer
Location: Remote
Compensation: Salary
Reviewed: Tue, Jun 09, 2026
This job expires in: 30 days
Job Summary
The full-time, remote Lead Site Reliability Engineer will design and operate scalable infrastructure for observability and telemetry platforms, improve operational efficiency, and partner with engineering teams.
Key responsibilities
- Build and operate scalable observability and telemetry platforms that process logs, metrics, traces, and events across production environments
- Design resilient, automated infrastructure and platform services to improve reliability, scalability, and efficiency
- Troubleshoot complex production issues and participate in incident response and on-call processes
Required qualifications
- 7+ years of experience operating and engineering large-scale production infrastructure and distributed systems
- Strong expertise in Linux systems engineering, cloud infrastructure, and SRE practices
- Proven experience designing and operating observability and telemetry platforms
- Hands-on experience with tools such as OpenSearch/Elasticsearch, Kafka, Prometheus, and Grafana
- Experience building Infrastructure as Code solutions using Terraform, CloudFormation, or equivalent tools