Model Serving Engineer

Location: Remote

Compensation: Salary

Reviewed: Mon, Jun 08, 2026

This job expires in: 30 days

Job Category: Information Technology

Weekly Hours: Full Time

Employment Status: Independent Contractor

Employer Type: Employer

Career Level: Experienced

Education Level: Bachelors, Masters

Job Summary

The Model Serving Engineer is a full-time remote role that designs and operates high-performance inference platforms for serving large machine learning models. The work focuses on systems engineering concerns such as request routing, batching, and end-to-end observability.

Key Responsibilities

Design and operate model serving platforms for diverse workloads including LLMs and vision models
Optimize inference performance through techniques like continuous batching and request multiplexing
Implement autoscaling and capacity management systems to balance latency, throughput, and cost

Required Qualifications

Bachelor's or Master's degree in Computer Science or a related field
Six or more years of experience in distributed systems or ML platform engineering
Strong proficiency in Python and a systems language such as Go, Rust, or C++
Deep experience operating high-throughput, low-latency services in production
Hands-on experience with LLM or large model inference frameworks like vLLM or TensorRT-LLM

