Job Summary
An internet services company is filling a position for a Remote Network Operations Site Reliability Engineer.
Core Responsibilities Include:
- Improving the reliability, availability, and observability of network infrastructure, including both software and hardware systems
- Reducing human toil through better automation and tooling, whether by identifying existing tools or building new ones
- Acting as an escalation point and providing on-call support for issues raised by our network-operations team
Skills and Requirements Include:
- Bachelor's degree in Computer Science or a related technical field involving software/systems engineering
- Prior experience in DevOps or Site Reliability Engineering roles
- Familiarity with infrastructure monitoring tools, e.g., Prometheus, Nagios, Datadog
- Experience with designing, analyzing, and troubleshooting large-scale, public-facing, distributed systems
- Familiarity with infrastructure-as-code tools, e.g., Terraform, Puppet
- Ability to work in a fast-paced environment, supporting multiple concurrent projects