Head of Inference

Location: Remote
Compensation: To Be Discussed
Reviewed: Wed, May 06, 2026

Job Summary

A company is seeking a Head of Inference to set its Edge AI inference strategy and own the inference serving layer.

Key Responsibilities
  • Create the inference strategy and define the inference architecture for Edge AI
  • Own the inference serving layer end-to-end and build a credible proof of concept
  • Build distributed inference pipelines and set performance baselines for inference latency and throughput


Required Qualifications
  • Experience with production inference serving technologies such as vLLM, TensorRT-LLM, or Triton Inference Server
  • Deep knowledge of quantization, containerization, and cost-per-token optimization
  • Proficiency in C++/CUDA/Rust and GPU utilization optimization
  • Experience with systems engineering and technical leadership
  • Startup experience with a focus on rapid execution and clear communication
