Data Engineer III

Agile Defense, Doral, FL

About The Position

Build Scalable Data & ML Infrastructure

  • Design and implement medallion architecture (Bronze/Silver/Gold) using Databricks for reliable data processing and ML model training
  • Develop automated data pipelines that process structured and unstructured data from multiple sources into analytics-ready formats
  • Create robust ETL/ELT workflows using Apache Spark and modern data engineering practices for both batch and streaming data
  • Build and maintain data quality monitoring and validation systems across the entire data and ML lifecycle

Drive ML Platform Excellence

  • Implement MLOps best practices including automated model training, validation, deployment, and monitoring using MLflow and Databricks workflows
  • Design scalable ML inference systems that handle high-volume, low-latency predictions in production environments
  • Create comprehensive monitoring and alerting systems for model performance, data drift, and system health
  • Build self-service ML capabilities that enable data scientists to deploy and monitor their own models efficiently

Enable Advanced Analytics & Business Intelligence

  • Design and maintain data models that support both machine learning workloads and business intelligence requirements
  • Create integration points between ML systems and business intelligence platforms (Tableau, PowerBI, Qlik Sense)
  • Implement data governance standards and metadata management systems that ensure data quality and compliance
  • Collaborate with analysts and data scientists to optimize data architecture for both predictive modeling and reporting needs

Ensure Data Quality & Governance

  • Implement comprehensive data governance frameworks including data lineage tracking, quality monitoring, and compliance controls
  • Design and maintain data catalogs and metadata management systems that enable efficient data discovery across the organization
  • Establish data quality standards and automated testing frameworks for both analytical and ML workloads
  • Work with stakeholders to define data definitions, business logic, and governance policies

Integrate with Enterprise Systems

  • Build integrations with MAVEN Smart Systems (Palantir Foundry) environments to support operational and predictive analytics
  • Connect Databricks-based systems with enterprise data warehouses, streaming platforms, and business applications
  • Implement security and compliance controls that meet enterprise requirements while enabling self-service capabilities
  • Collaborate with platform engineers to integrate ML systems with broader application architecture and infrastructure

Requirements

  • 5+ years of technical experience, including 3+ years building production data pipelines and ML infrastructure using distributed computing platforms like Databricks
  • Strong data engineering skills in Python, PySpark, and Spark SQL with experience implementing medallion architecture and modern data platform patterns
  • Production ML systems experience including model deployment, monitoring, and MLOps practices in cloud environments
  • Data architecture expertise with experience designing scalable data processing systems and implementing data governance frameworks
  • Experience integrating with platforms such as Qlik, Tableau, PowerBI, MAVEN Smart System (Palantir), or similar

What This Job Offers

Job Type: Full-time

Career Level: Mid Level

Education Level: No Education Listed

Number of Employees: 501-1,000 employees