Data Engineer

Job is Expired
Location: Remote
Compensation: To Be Discussed
Reviewed: Tue, Apr 08, 2025
Python, Spark, Airflow, Kubernetes

Job Summary

A company is looking for a Data Engineer to build data pipelines for next-generation generative video models.

Key Responsibilities
  • Build scalable, high-throughput data pipelines optimized for multi-modal video model training
  • Develop systems for data ingestion, deduplication, quality assessment, validation, filtering, and labeling
  • Optimize distributed data processing frameworks and collaborate with infrastructure teams to scale pipelines across thousands of GPUs

Required Qualifications
  • Deep experience in building and scaling data infrastructure for large-scale ML systems, ideally for video or multi-modal models
  • Solid background in ML engineering with hands-on experience in training and optimizing classifiers
  • Experience managing large-scale datasets and pipelines in production
  • Expertise in Python and data frameworks such as Spark, Airflow, or similar
  • Understanding of modern infrastructure such as Kubernetes, Terraform, object stores, and distributed computing environments