Applied Research Engineer

Location: Remote
Compensation: Salary
Reviewed: Wed, Mar 11, 2026
This job expires in: 27 days

Job Summary

A company is looking for an Applied Research Engineer - Training Infra.

Key Responsibilities
  • Set up and manage GPU cluster infrastructure on major cloud providers for distributed model training
  • Build and operate job orchestration and scheduling systems for managing training and evaluation jobs
  • Integrate and maintain ML training frameworks and post-training pipelines for stable operation at scale
Required Qualifications
  • Hands-on experience managing GPU clusters on major cloud providers
  • Experience with distributed compute orchestration tools such as Kubernetes or Slurm
  • Working knowledge of distributed training concepts and memory optimization techniques
  • Strong Python proficiency and solid software engineering fundamentals
  • Experience with post-training workflows such as supervised fine-tuning is a plus

COMPLETE JOB DESCRIPTION

The job description is available to subscribers. Subscribe today to get the full benefits of a premium membership with Virtual Vocations. We offer the largest remote database online...