Job Summary
A company in the Information Technology industry is seeking a Lead Software Reliability Engineer. The role involves designing and implementing highly available, scalable infrastructure solutions; collaborating with software engineering teams; developing and maintaining monitoring systems; building and maintaining deployment pipelines; leading incident response efforts; and automating the delivery of new products and services.
Position Responsibilities
- Design and implement highly available and scalable infrastructure solutions
- Collaborate with software engineering teams to ensure applications are built with reliability and observability in mind
- Develop and maintain monitoring, alerting, and logging systems to proactively identify and address potential issues
Required Qualifications
- Experience building CI/CD pipelines using tools such as GitHub Actions, Jenkins, or ArgoCD
- Expertise in AWS, particularly CloudFormation
- Able to build secure, efficient Docker images and deploy them to Kubernetes
- Fluent with Helm and/or Kustomize
- Fluent in one or more of the following: Go, Python, Java, or JavaScript