AI Inference Engineer
Location: Remote
Compensation: Salary
Reviewed: Wed, Feb 18, 2026
This job expires in: 28 days
Job Summary
A company is looking for a Forward Deployed Engineer, AI Inference (vLLM and Kubernetes).
Key Responsibilities
- Deploy and configure LLM-D and vLLM on Kubernetes clusters, applying and tuning advanced deployment strategies
- Run performance benchmarks and tune parameters to ensure optimal latency and throughput in production environments
- Collaborate with customer engineers to write production-quality code that integrates inference engines into existing ecosystems
Required Qualifications
- 8+ years of experience in backend systems, SRE, or infrastructure engineering
- Deep expertise in Kubernetes, including custom resources and high-performance networking
- Proficiency in Python and Go
- Experience with Infrastructure-as-Code tools such as Helm or Terraform
- Familiarity with AI inference concepts and deployment on cloud and GPU hardware