The full-time, 100% remote AI Research Engineer will drive innovation in model serving and inference architectures and optimize model deployment strategies for advanced AI systems.

Key responsibilities

Design and deploy model serving architectures that optimize throughput, latency, and memory usage across diverse environments
Build and monitor inference tests in production environments, tracking performance metrics and documenting results against benchmarks
Analyze computational efficiency to identify and resolve bottlenecks in the serving pipeline, ensuring scalability and reliability on resource-constrained systems

Required qualifications

PhD in Computer Science, NLP, Machine Learning, or a related field, with a strong track record in AI R&D
Proven experience in low-level kernel optimizations and inference optimization on mobile devices
Deep understanding of model serving architectures and inference optimization techniques
Strong expertise in writing GPU kernels for mobile devices and developing end-to-end inference pipelines
Knowledge of advanced topics such as tensor parallelism and diffusion models
