Applied Research Engineer
Location: Remote
Compensation: Salary
Reviewed: Wed, Mar 11, 2026
This job expires in: 27 days
Job Summary
A company is looking for an Applied Research Engineer - Training Infra.
Key Responsibilities
- Set up and manage GPU cluster infrastructure on major cloud providers for distributed model training
- Build and operate job orchestration and scheduling systems for managing training and evaluation jobs
- Integrate and maintain ML training frameworks and post-training pipelines for stable operation at scale
Required Qualifications
- Hands-on experience managing GPU clusters on major cloud providers
- Experience with distributed compute orchestration tools such as Kubernetes or Slurm
- Working knowledge of distributed training concepts and memory optimization techniques
- Strong Python proficiency and solid software engineering fundamentals
- Experience with post-training workflows such as supervised fine-tuning is a plus
COMPLETE JOB DESCRIPTION
The job description is available to subscribers. Subscribe today to get the full benefits of a premium membership with Virtual Vocations. We offer the largest remote database online...