Partial Telecommute: Linux HPC Cluster Administrator
DESCRIPTION:
The candidate will be responsible for supporting the Geophysical Laboratory's Linux High-Performance Computational (HPC) clusters in a non-profit scientific setting located in the NW neighborhood of Washington DC. The candidate will play a key role in improving the security, performance, and reliability of the Laboratory's high-performance computing infrastructure.
SUMMARY OF DUTIES:
Install, configure, and maintain cluster hardware and system software
Install, configure, optimize, and maintain cluster-aware scientific applications and compilers
Diagnose hardware and system operational problems quickly and effectively; automate problem reporting
Coordinate with vendors to resolve hardware and software problems
Port cluster management tools
Set up reliable and efficient backups for all clusters
Implement best practices and document procedures for all cluster-related tasks
MINIMUM REQUIREMENTS:
Strong working knowledge of Linux
Experience with managing and supporting large-scale Linux installations
Experience with managing and maintaining high-performance computational clusters
Shell scripting and programming experience
Ability to work well with others and communicate effectively using terminology that staff and other users can understand
Ability to respond to issues quickly, multitask, and meet deadlines
Must be eligible to work in the United States.
DESIRED EXPERIENCE:
Experience with high-performance networks
Experience with cluster file systems
Experience with batch schedulers and queuing systems
Experience installing and maintaining clustered environments, including automated installation methods
Experience with building HPC clusters
C and Fortran programming and porting experience
SALARY:
This is a part-time or contract position; salary will be based on experience. We need someone who works with Linux clusters every day, enjoys that work, and doesn't mind getting paid for it. You will be able to work after hours and telecommute when appropriate. The applicant should be able to provide on-call support and off-hours emergency maintenance when necessary.
We are looking for people in the Washington DC metropolitan area.
Employer Posted: Monday, April 14, 2008






