Staff Machine Learning Engineer

Location: Remote
Compensation: Salary
Reviewed: Mon, Apr 06, 2026
This job expires in: 30 days

Job Summary

A company is looking for a Staff Machine Learning Engineer, GenAI Platform.

Key Responsibilities
  • Drive GenAI infrastructure strategy by proposing and leading the architecture of the LLM platform
  • Design resilient, large-scale distributed systems for fault-tolerant training infrastructure
  • Build self-serve LLM workflows and develop comprehensive evaluation and benchmarking infrastructure

Required Qualifications
  • 10+ years of experience in production software development or building complex distributed data systems
  • Expertise in GenAI/LLM infrastructure and distributed training frameworks
  • Hands-on experience with fault-tolerant, petabyte-scale distributed systems
  • Advanced knowledge of MLOps and modern ML orchestration tools
  • Experience with Kubernetes, Docker, and building production-quality code in Python and/or Go
