Job Summary
A company that provides a universal identity platform is in need of a Telecommute Senior Site Reliability Engineer.
Candidates will be responsible for the following:
- Maintaining services once they are live by measuring and monitoring availability, latency and overall system health
- Scaling systems sustainably through automation, and evolving systems by pushing for changes that improve reliability and velocity
- Being on-call for services that the SRE team on-boards
Must meet the following requirements for consideration:
- Interest in designing, analyzing and troubleshooting large-scale distributed systems
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Strong ability to debug and optimize code, and to automate routine tasks
- Experience designing applications and systems that scale, are resilient to failure, and are observable