Designing and building large-scale data pipelines using Python, PySpark, and Scala
Performance tuning and optimization of Spark jobs in GCP environments
Developing and maintaining production-ready workflows using orchestration tools like Kubeflow Pipelines or Airflow

Required Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
5-20 years of experience in data engineering with a focus on Python, PySpark, and/or Scala
Expertise in Spark performance tuning and optimization in cloud-based environments (preferably GCP)
Hands-on experience with workflow orchestration tools like Kubeflow Pipelines (KFP) or Airflow
Proficiency with Docker and container-based deployment strategies

Job is Expired

Lead Data Engineer
