AI Inference Engineer

Location: Remote
Compensation: Salary
Reviewed: Wed, Feb 18, 2026
This job expires in: 28 days

Job Summary

A company is looking for a Forward Deployed Engineer, AI Inference (vLLM and Kubernetes).

Key Responsibilities
  • Deploy and configure LLM-D and vLLM on Kubernetes clusters, applying and tuning advanced deployment strategies
  • Run performance benchmarks and tune parameters to ensure optimal latency and throughput in production environments
  • Collaborate with customer engineers to write production-quality code that integrates inference engines into existing ecosystems
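The benchmarking responsibility above can be sketched in miniature. The snippet below is a hypothetical illustration, not part of the posting: it times repeated calls to a request function and reports median/p95 latency and throughput. The `fake_request` stub stands in for a real HTTP call to a vLLM OpenAI-compatible endpoint (e.g. `POST /v1/completions`); the function names and parameters are assumptions for the sketch.

```python
import time
import statistics

def benchmark(send_request, n_requests=20):
    """Measure per-request latency (seconds) for a callable that
    issues one inference request and blocks until the response."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        send_request()  # one synchronous inference request
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_s": statistics.median(latencies),
        "p95_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
        "throughput_rps": n_requests / elapsed,
    }

# Stub standing in for a real call to a vLLM server's
# OpenAI-compatible API; swap in an HTTP client in practice.
def fake_request():
    time.sleep(0.001)

stats = benchmark(fake_request)
```

In a real engagement the same harness would be run while sweeping vLLM parameters (batching, parallelism, cache settings) to find the latency/throughput trade-off a customer needs.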

Required Qualifications
  • 8+ years of experience in Backend Systems, SRE, or Infrastructure Engineering
  • Deep expertise in Kubernetes, including custom resources and high-performance networking
  • Proficiency in Python and Go programming languages
  • Experience with Infrastructure as Code tools like Helm or Terraform
  • Familiarity with AI inference concepts and deployment on cloud and GPU hardware

COMPLETE JOB DESCRIPTION

The complete job description is available to Virtual Vocations subscribers.