Technical Lead - AI Inference
Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 07, 2026
This job expires in: 25 days
Job Summary
A company is looking for a Technical Lead - AI Inference.
Key Responsibilities
- Architect and oversee the deployment of high-throughput, low-latency LLM inference pipelines
- Mentor and lead a small team of developers, conducting code reviews and sprint planning
- Implement and evaluate KV cache management solutions to optimize inference throughput and memory efficiency
Required Qualifications
- Proven experience with KV cache reuse and continuous batching in AI inference
- Deep familiarity with serving frameworks like vLLM, LMCache, and NIXL
- Expertise in backend engineering with proficiency in Python, C++, or Rust
- Experience with CUDA and GPU memory management
- Familiarity with Kubernetes for scaling GPU workloads