Senior ML Engineer

Location: Remote

Compensation: Salary

Reviewed: Mon, May 25, 2026

This job expires in: 30 days

Job Category: Research

Employer Type: Employer

Career Level: Experienced, Senior Level

Job Summary

The full-time, remote Senior ML Engineer will drive inference optimization: increasing throughput, reducing latency, and improving KV cache utilization. The role draws on advanced machine learning techniques and a deep understanding of both hardware and software systems.

Key Responsibilities

Push throughput through continuous batching and kernel-level tuning across various ML frameworks
Cut latency by profiling and addressing bottlenecks in compute, memory, and networking
Quantize models without quality regression and optimize KV cache strategies for enhanced performance

Required Qualifications

5+ years of experience building real ML systems with a focus on inference or training infrastructure
Strong proficiency in Python for production services
Hands-on experience with vLLM, SGLang, or TensorRT-LLM
Fluency in quantization tradeoffs and experience measuring quality regressions
Comfort with distributed systems, including multi-GPU and multi-node setups
