Data Infrastructure Engineer

Job is Expired
Location: Remote
Compensation: Salary
Reviewed: Wed, Aug 27, 2025

Job Summary

A company is looking for a Data Infrastructure Engineer to design and maintain distributed data systems for AI model training.

Key Responsibilities
  • Design and maintain distributed ingestion pipelines for structured and unstructured data
  • Support preprocessing of unstructured assets for training pipelines and implement validation checks
  • Architect pipelines across cloud storage and optimize large-scale processing with distributed frameworks

Required Qualifications
  • 5+ years of experience in data engineering or distributed systems
  • Strong programming skills in Python; familiarity with Scala/Java/C++ is a plus
  • Proficiency with distributed frameworks such as Spark, Dask, or Ray
  • Experience with cloud platforms (AWS/GCP/Azure) and storage systems
  • Familiarity with workflow orchestration tools like Airflow or Prefect
