Model Performance Engineer

Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 29, 2025
This job expires in: 11 days
Skills: PyTorch, TensorRT, vLLM

Job Summary

A company is looking for a Model Performance Engineer to optimize inference performance for AI models on its platform.

Key Responsibilities
  • Optimize inference performance by minimizing latency and maximizing throughput
  • Experiment continuously to achieve industry-leading performance for various models
  • Impact the performance of applications serving millions of users globally

Required Qualifications
  • Experience with state-of-the-art inference stacks such as PyTorch, TensorRT, or vLLM
  • Open to candidates with any level of experience, including new graduates
  • Ability to work in a fast-paced environment and adapt to new challenges
  • Willingness to work in person in New York City; remote work is possible for exceptionally qualified candidates
  • Visa sponsorship available for qualified candidates