Senior Inference Engineer

Location: Remote
Compensation: Salary
Reviewed: Sat, Jun 13, 2026
This job expires in: 26 days

Job Summary

The Senior Inference Engineer is a full-time role focused on advancing AIConfigurator, with the option to work remotely. The engineer will build and optimize deployment configurations for large-scale LLM inference, integrate GPU systems, and collaborate with other teams to improve performance on NVIDIA platforms.

Key responsibilities
  • Build and evolve AIConfigurator's core optimization engine for LLM serving, focusing on configuration search and efficiency estimation
  • Develop production-quality Python/Rust APIs and workflows to assist users in generating deployment configurations for NVIDIA GPU clusters
  • Collaborate with performance and benchmarking teams to ensure alignment between simulated results and actual deployment performance

Required qualifications
  • BS, MS, or PhD in Computer Science, Computer Engineering, Electrical Engineering, Applied Math, or a related field, or equivalent experience
  • 10+ years of relevant software engineering experience
  • Strong Python/Rust engineering skills, including experience with production APIs and maintainable software development
  • Experience with GPU computing, distributed systems, or high-performance model serving
  • Understanding of LLM inference concepts such as batching, latency, and parallelism strategies
