Cloud Data Engineer

Detroit Tigers
Detroit, MI
Posted 1d ago
Remote

About The Position

The Detroit Tigers are seeking a Cloud Data Engineer, Baseball Systems. This role will be responsible for designing, managing, and automating data processes across our data architecture to support Baseball Operations initiatives, including the deployment and operationalization of machine learning models. This position will report to the Manager, Baseball Systems Data.

Requirements

  • Proficiency building data processing pipelines using SQL and Python.
  • Experience with cloud computing, cloud storage, and cloud services.
  • Experience with cloud-based data lakes, data warehouses, and related tooling.
  • Strong understanding of data strategies and practices, such as continuous integration, regression testing, and versioning.
  • Experience building, maintaining, and querying SQL data warehouses built for data science and analytics.
  • Familiarity with MLOps concepts and tooling, including model serving, monitoring, and pipeline orchestration.

Nice To Haves

  • Understanding of data quality frameworks and best practices for implementation.
  • Familiarity with baseball and with current baseball research.
  • Experience using Apache Spark (Databricks on Azure preferred).
  • Experience with Airflow or similar workflow orchestration tools.
  • Effective communication skills with an ability to explain technical concepts to developers and business partners.
  • Experience with DevOps and MLOps practices for CI/CD pipelines, including model versioning and experiment tracking.
  • Experience working with containers and container deployment, including containerized model serving.
  • Familiarity with open-source data quality frameworks.

Responsibilities

  • Design, implement, and maintain our data architecture and processing pipelines at scale.
  • Design, implement, and use data quality assurance frameworks to identify inconsistent data patterns.
  • Collaborate with Tigers data engineers and data scientists to implement good data hygiene practices and procedures in our data processes.
  • Work with external data vendors to triage and remedy data quality issues.
  • Automate and execute test cases in data pipelines and manage data issue tracking.
  • Build and maintain MLOps infrastructure to support the deployment, monitoring, and retraining of machine learning models in production.
  • Partner with data scientists to productionize models, ensuring reproducibility, scalability, and reliability across the ML lifecycle.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 251-500 employees
