Head of Inference

Location: Remote

Compensation: To Be Discussed

Reviewed: Wed, May 06, 2026

This job expires in: 24 days

Job Category: Information Technology

Weekly Hours: Full Time

Employer Type: Employer

Job Summary

A company is looking for a Head of Inference.

Key Responsibilities

Create the inference strategy and define the inference architecture for Edge AI
Own the inference serving layer end-to-end and build a credible proof of concept
Build distributed inference pipelines and set performance baselines for inference latency and throughput

Required Qualifications

Experience with production inference serving technologies such as vLLM, TensorRT-LLM, or Triton Inference Server
Deep knowledge of quantization, containerization, and cost-per-token optimization
Proficiency in C++/CUDA/Rust and GPU utilization optimization
Experience with systems engineering and technical leadership
Startup experience with a focus on rapid execution and clear communication
