Senior ML Engineer

Location: Remote

Compensation: Salary

Reviewed: Mon, May 25, 2026

This job expires in: 30 days

Job Category: Research

Employer Type: Employer

Career Level: Experienced, Senior Level

Job Summary

Kimchi is hiring a full-time Senior ML Engineer to drive inference optimization: increasing throughput, reducing latency, and improving KV cache utilization. The role is remote and high-autonomy, focused on building efficient ML systems.

Key responsibilities

Increase throughput through continuous batching, kernel-level tuning, and GPU performance optimization
Cut latency by profiling and addressing bottlenecks in compute, memory bandwidth, and scheduling
Quantize models without quality regression and improve KV cache utilization for enhanced throughput

Required qualifications

5+ years of experience building real ML systems with a focus on inference or training infrastructure
Strong proficiency in Python for production services
Hands-on experience with vLLM, SGLang, or TensorRT-LLM, and an understanding of inference engine performance
Fluency with quantization tradeoffs and practical experience in distributed systems
A bias toward measurement and self-direction in a broad mandate role

