AI Inference Engineer
Location: Remote
Compensation: To Be Discussed
Reviewed: Wed, May 27, 2026
This job expires in: 30 days
Job Summary
The full-time, remote AI Inference Engineer will own the inference backbone of QVAC's local AI stack, enhancing C++ systems for efficient model deployment on edge devices with a focus on runtime stability and performance optimization.
Key responsibilities
- Deploy machine learning models to edge devices using frameworks like llama.cpp, ggml, and ONNX
- Collaborate with researchers to transition models from research to production environments
- Integrate AI features into existing products, enhancing them with the latest advancements in machine learning
Required qualifications
- Excellent programming skills in C++; experience with JavaScript is a bonus
- Strong experience with the llama.cpp and ggml inference engines for deploying models on specific GPU architectures
- Good understanding of deep learning concepts and model architectures
- Experience with transformers, LLMs, and diffusion models
- A degree in Computer Science, AI, Machine Learning, or a related field, with a solid track record in AI R&D