Senior Principal Software Engineer
Location: Remote
Compensation: Salary
Reviewed: Mon, Jun 29, 2026
This job expires in: 26 days
Job Summary
In this full-time, remote role focused on the future of mobility, the Senior Principal Software Engineer will optimize and deploy high-performance LLM inference pipelines, manage inference runtimes across data center, edge, and embedded platforms, and improve model performance through techniques such as quantization and key-value cache optimization.
Key responsibilities
- Optimize and deploy high-performance LLM inference pipelines across data center, edge, and embedded platforms
- Implement quantization strategies and optimize key-value cache performance to improve latency and throughput
- Drive latency and throughput improvements that directly affect production products and enable efficient deployment without depending on external vendors
Required qualifications
- Proven experience optimizing ML inference performance in production environments
- Deep understanding of GPU architecture and memory hierarchies
- Hands-on experience with CUDA and low-level performance tuning
- Familiarity with inference engines such as vLLM, TensorRT-LLM, and llama.cpp
- Experience with quantization techniques and latency optimization strategies