Principal AI Architect
Location: Remote
Compensation: To Be Discussed
Reviewed: Thu, May 21, 2026
This job expires in: 30 days
Job Summary
The full-time Principal AI Performance Modeling Architect will lead performance modeling and optimization for large-scale ML systems, working from GPU architecture specifications to achieve significant performance gains in training and inference pipelines.
Key responsibilities:
- Lead performance modeling and optimization for multi-trillion-parameter LLM training and inference across various modalities
- Architect memory-efficient training systems utilizing advanced techniques such as structured pruning and quantization
- Collaborate with internal and external stakeholders to disseminate results and iterate rapidly on innovative solutions
Required qualifications:
- Extensive experience optimizing large-scale ML systems and GPU architectures
- Deep expertise in CUDA programming and GPU memory hierarchies
- Proven track record in architecting distributed training systems for large-scale applications
- Expert knowledge of transformer architectures and model parallelism techniques
- Bachelor's, MS, or PhD in Computer Science/Engineering or equivalent industry experience