Remote Site Reliability Engineer

Job is Expired
Location: Nationwide
Compensation: Salary
Staff Reviewed: Mon, Apr 26, 2021

Job Summary

The Site Reliability Engineer is responsible for the availability and reliability of critical platform services and applications, ensuring they meet the requirements of internal and external users.

This position offers the chance to positively impact patient outcomes in healthcare by ensuring the availability and reliability of services used by healthcare providers to distribute important medical data. In addition, this position is a ground-floor opportunity to be instrumental in the transformation of an industry leader's offerings. You will be a key contributor in the transition from running services in a datacenter to providing services and capabilities that are scalable, always available and cloud-native.

If you enjoy solving hard problems and want to be a part of a legacy that impacts our world by improving patient outcomes, this job may be for you!

Main Responsibilities:
Creates solutions using cloud technologies to solve client technical and business challenges
Participates in system design consulting, platform management, and capacity planning
Implements new tools and techniques to increase scalability and performance
Architects and builds automation tools to increase reliability and speed
Implements CI/CD solutions to increase traceability and developer experience
Implements monitoring and logging systems. Gathers and analyzes metrics from both operating systems and applications to assist in performance tuning and fault finding
Measures and optimizes system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Partners with development teams to improve services through rigorous testing and release procedures
Creates sustainable systems and services through automation of tasks and uplifts
Builds software and systems to manage platform infrastructure and applications
Improves reliability, quality, and time-to-market of our suite of software solutions
Maintains or refactors existing processes to align with serverless architecture.
Plans and conducts technical tasks associated with the implementation and maintenance of internal cloud enterprise-shared virtualization infrastructure.
Deploys software to cloud computing infrastructure and works with system configuration, deployment automation, and ETL tools and techniques.
Performs the implementation, operational support, maintenance, and optimization of network hardware, software, and communication links of the cloud infrastructure.
Resolves complex problems, creates and improves procedures, and facilitates communication.

Education & Experience:
Bachelor's degree preferred
A minimum of 5 years of experience as a Site Reliability Engineer.
Strong attention to detail
Demonstrated oral and written communication skills.
Ability to work independently and meet deadlines.
Ability to work effectively in a cross-functional team; demonstrated ability to partner with other departments.
Excellent teamwork skills.
Ability to react to change productively.
Ability to program in one or more high-level languages (Examples: Java, Python, JavaScript).
Understanding of networking concepts both in self-managed and cloud environments
Experience with designing and implementing distributed systems (Examples: Kubernetes, Functions-as-a-Service).
Experience implementing and using Infrastructure-as-Code and Configuration Management tooling (Examples: Terraform, AWS CloudFormation, Puppet, Ansible, ARM, Google Cloud Deployment Manager).
Experience with DevOps/GitOps tooling (Examples: Jenkins, CircleCI, Bamboo, GitLab CI, GitHub Actions, GoCD).
Experience with scripting systems (Examples: Python, Perl, Bash, PowerShell)
Experience with Application Performance Monitoring, alerting, notification and reporting tools (Examples: Nagios, Prometheus, ELK, New Relic, AppDynamics, OpsGenie).
Experienced and comfortable coordinating live incident calls.
Experienced and comfortable coordinating and documenting Root Cause Analysis investigations.
