Technical Lead - AI Inference

Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 07, 2026
This job expires in: 25 days

Job Summary

A company is looking for a Technical Lead - AI Inference.

Key Responsibilities
  • Architect and oversee the deployment of high-throughput, low-latency LLM inference pipelines
  • Mentor and lead a small team of developers, conducting code reviews and sprint planning
  • Implement and evaluate state-of-the-art KV cache management solutions to optimize inference

Required Qualifications
  • Proven experience with KV cache reuse, speculative decoding, and continuous batching
  • Deep familiarity with serving frameworks such as vLLM, LMCache, and NIXL
  • Expertise in backend engineering with Python, C++, or Rust, and knowledge of CUDA and GPU memory management
  • Experience with Kubernetes (K8s) for scaling GPU workloads
