Principal AI Architect

Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 21, 2026
This job expires in: 30 days

Job Summary

Driving the future of AI infrastructure, the Principal AI Performance Modeling Architect will lead performance modeling and optimization for large-scale ML systems in this full-time role, with a focus on GPU architecture specifications that deliver significant performance gains in training and inference pipelines.

Key responsibilities:
  • Lead performance modeling and optimization for multi-trillion-parameter LLM training and inference across various modalities
  • Architect memory-efficient training systems using advanced techniques such as structured pruning and quantization
  • Collaborate with internal and external stakeholders to disseminate results and iterate rapidly on innovative solutions

Required qualifications:
  • Extensive experience optimizing large-scale ML systems and GPU architectures
  • Deep expertise in CUDA programming and GPU memory hierarchies
  • Proven track record in architecting distributed training systems for large-scale applications
  • Expert knowledge of transformer architectures and model parallelism techniques
  • Bachelor's, MS, or PhD in Computer Science/Engineering or equivalent industry experience