
Senior Scientist in Synthetic Data

Location: Remote
Compensation: Salary
Reviewed: Wed, Jun 10, 2026
This job expires in: 6 days

Job Summary

The full-time Senior Scientist in Synthetic Data will advance synthetic data generation for training frontier models by building pipelines with LLM-based methods, collaborating with various teams, and contributing to open-source libraries. The role can be performed remotely or onsite.

Key responsibilities
  • Build synthetic data generation pipelines to enhance the training of LLMs, focusing on reasoning, coding, and multimodal understanding
  • Advance multimodal synthetic data generation in collaboration with NVIDIA's model teams
  • Design and maintain open-source libraries and SDKs, ensuring clean APIs and comprehensive documentation

Required qualifications
  • PhD in Computer Science, Machine Learning, Statistics, or a related field, or equivalent experience
  • 3+ years of research experience in synthetic data generation, generative modeling, or multimodal machine learning
  • Deep technical understanding of LLMs and their data requirements for training and inference
  • Proven track record of developing or maintaining widely used software libraries
  • Strong publication record at major machine learning and AI conferences
