Senior Inference Engineer
Location: Remote
Compensation: Salary
Reviewed: Sat, Jun 13, 2026
This job expires in: 26 days
Job summary
The full-time Senior Inference Engineer will advance AIConfigurator by building and optimizing deployment configurations for large-scale LLM inference on NVIDIA platforms. The role involves integrating GPU systems and collaborating with other teams to improve performance. Remote work is an option.
Key responsibilities
- Build and evolve AIConfigurator's core optimization engine for LLM serving, focusing on configuration search and efficiency estimation
- Develop production-quality Python/Rust APIs and workflows that help users generate deployment configurations for NVIDIA GPU clusters
- Collaborate with performance and benchmarking teams to keep simulated results aligned with actual deployment performance
Required qualifications
- BS, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Math, or a related field, or equivalent experience
- 10+ years of relevant software engineering experience
- Strong Python/Rust engineering skills, including experience with production APIs and maintainable software development
- Experience with GPU computing, distributed systems, or high-performance model serving
- Understanding of LLM inference concepts such as batching, latency, and parallelism strategies