Director of AI Alignment

Location: Remote
Compensation: Salary
Reviewed: Fri, Jun 12, 2026
This job expires in: 7 days

Job Summary

The full-time, remote Director of AI Alignment will lead the alignment and interpretability research agenda for security-domain AI. The role involves developing methods to explain model behavior and detect misuse signals, and establishing evaluation frameworks to ensure models operate within intended bounds.

Key responsibilities:
  • Own and prioritize the alignment and interpretability research agenda, addressing complex open problems in security-domain AI
  • Build techniques for detecting offensive misuse signals in model internals and collaborate with the adversarial evaluation team
  • Develop alignment methodologies and evaluation frameworks that ensure models operate safely and effectively

Required qualifications:
  • MS or PhD in machine learning, computer science, or a related field with research depth in interpretability or AI alignment
  • 8+ years of experience in ML research or engineering, specifically in interpretability or alignment research on large language models
  • Hands-on expertise with mechanistic interpretability methods applied to real models
  • Experience designing and conducting alignment evaluations that support meaningful safety claims
  • Proven track record of leading and developing research teams while actively contributing technically
