Job Summary
A software development company is seeking a Telecommute Machine Learning Infrastructure Software Engineer.
The individual must be able to fulfill the following responsibilities:
- Design and build scalable and reliable infrastructure to support model training efforts
- Optimize model training and inference performance for large datasets
- Develop and implement automated testing and monitoring frameworks
Qualifications for this position include:
- Bachelor's or Master's degree in Computer Science or a related field
- 4+ years of professional experience in software engineering, with a focus on machine learning infrastructure
- Experience with containerization and container orchestration tools
- Experience with accelerated and distributed model training
- Experience implementing model versioning, monitoring and observability systems
- Solid understanding of the fundamentals of statistics and deep learning