Director of AI Alignment

Location: Remote

Compensation: Salary

Reviewed: Fri, Jun 12, 2026

This job expires in: 7 days

Job Category: Business Operations

Weekly Hours: Full Time

Employment Status: Permanent

Employer Type: Employer

Career Level: Experienced, Senior Level

Education Level: Doctorate

Job Summary

The full-time, remote Director of AI Alignment will lead the alignment and interpretability research agenda for security-domain AI. The role involves developing methods to explain model behavior, detecting misuse signals, and establishing evaluation frameworks that keep models operating within intended bounds.

Key responsibilities:

Own and prioritize the alignment and interpretability research agenda, addressing complex open problems in security-domain AI
Build techniques for detecting offensive misuse signals in model internals and collaborate with the adversarial evaluation team
Develop alignment methodologies and evaluation frameworks that ensure models operate safely and effectively

Required qualifications:

MS or PhD in machine learning, computer science, or a related field with research depth in interpretability or AI alignment
8+ years of experience in ML research or engineering, specifically in interpretability or alignment research on large language models
Hands-on expertise with mechanistic interpretability methods applied to real models
Experience designing and conducting alignment evaluations that support meaningful safety claims
Proven track record of leading and developing research teams while actively contributing technically

