Data Engineer

Location: Remote
Compensation: Hourly
Reviewed: Thu, Feb 19, 2026
This job expires in: 29 days

Job Summary

A company is looking for a Software Engineer 3 to build and scale data pipelines for machine learning model training.

Key Responsibilities:
  • Build and scale distributed data pipelines for large volumes of time series and log data
  • Develop high-performance Spark/Python workflows for creating model training datasets
  • Diagnose and resolve performance issues related to latency, memory usage, data skew, and throughput

Required Qualifications:
  • Proficiency in Python
  • Comprehensive experience with Apache Spark (PySpark or Scala)
  • Proven track record in building large-scale data pipelines in distributed systems
  • Prior experience in data engineering for machine learning or large-scale model training workflows
  • Familiarity with time series or event-driven data systems is advantageous
