AI Inference Engineer

Location: Remote
Compensation: Salary
Reviewed: Wed, Feb 18, 2026
This job expires in: 28 days

Job Summary

A company is looking for a Forward Deployed Engineer, AI Inference (vLLM and Kubernetes).

Key Responsibilities
  • Deploy and configure LLM-D and vLLM on Kubernetes clusters, applying and tuning advanced deployment strategies
  • Run performance benchmarks and tune parameters to ensure optimal latency and throughput in production environments
  • Collaborate with customer engineers to write production-quality code that integrates inference engines into existing ecosystems
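The benchmarking responsibility above can be sketched in miniature. The snippet below is a hypothetical illustration, not part of the posting: it times repeated calls to a request function and reports median/p95 latency and throughput. The `fake_request` stub stands in for a real HTTP call to a vLLM OpenAI-compatible endpoint (e.g. `POST /v1/completions`); the function names and parameters are assumptions for the sketch.

```python
import time
import statistics

def benchmark(send_request, n_requests=20):
    """Measure per-request latency (seconds) for a callable that
    issues one inference request and blocks until the response."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        send_request()  # one synchronous inference request
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_s": statistics.median(latencies),
        "p95_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
        "throughput_rps": n_requests / elapsed,
    }

# Stub standing in for a real call to a vLLM server's
# OpenAI-compatible API; swap in an HTTP client in practice.
def fake_request():
    time.sleep(0.001)

stats = benchmark(fake_request)
```

In a real engagement the same harness would be run while sweeping vLLM parameters (batching, parallelism, cache settings) to find the latency/throughput trade-off a customer needs.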

Required Qualifications
  • 8+ years of experience in Backend Systems, SRE, or Infrastructure Engineering
  • Deep expertise in Kubernetes, including custom resources and high-performance networking
  • Proficiency in Python and Go programming languages
  • Experience with Infrastructure as Code tools like Helm or Terraform
  • Familiarity with AI inference concepts and deployment on cloud and GPU hardware

COMPLETE JOB DESCRIPTION

The complete job description is available to Virtual Vocations subscribers.