Optimize inference performance by minimizing latency and maximizing throughput
Experiment continuously to achieve industry-leading performance for various models
Impact the performance of applications serving millions of users globally

Required Qualifications

Experience with state-of-the-art inference stacks such as PyTorch, TensorRT, or vLLM
Open to candidates with any level of experience, including new graduates
Ability to work in a fast-paced environment and adapt to new challenges
Willingness to work in person in New York City; remote work is possible for exceptionally qualified candidates
Visa sponsorship available for qualified candidates


Model Performance Engineer