Senior Data Engineer - Full Stack

Codvo.ai
New York, NY
Hybrid

About The Position

We are seeking a highly skilled Senior Data Engineer - Full Stack to build and maintain internal tools, automation frameworks, and workflows that enhance the efficiency, reliability, and scalability of our data and machine learning platforms. In this role, you will work closely with Data Engineers, Data Scientists, and ML Engineers to streamline operations across the data lifecycle.

Requirements

  • Strong experience in Python and scripting for automation and backend development
  • Hands-on experience with Databricks platform and ecosystem
  • Experience with APIs, Terraform, and/or Databricks SDK for automation
  • Solid understanding of ETL/ELT pipelines and data platform architecture
  • Experience building testing frameworks for data pipelines and ML workflows
  • Familiarity with CLI tool development and system automation
  • Knowledge of MLOps principles and practices
  • Experience with modern development practices, including spec-driven development, use of coding agents or automation-assisted development tools, and version control with CI/CD pipelines
  • 8+ years of experience in Data Engineering, Platform Engineering, or related roles
  • Experience working in data-driven or ML-focused environments

Nice To Haves

  • Experience building dashboards or internal tools using React, Streamlit, or similar frameworks
  • Familiarity with Databricks AI/BI or other data visualization tools
  • Exposure to data governance and metadata management frameworks
  • Experience working with cloud platforms (AWS preferred)

Responsibilities

  • Design and develop CLI tools, scripts, and internal utilities to automate repetitive tasks across the data platform
  • Automate pipeline execution and orchestration, data governance workflows, metadata synchronization, and environment setup and configuration
  • Develop test harnesses
  • Automate workflows on Databricks, including job deployment, scheduling, environment provisioning, and MLOps processes, using APIs, Terraform, or the Databricks SDK
  • Build and implement robust testing frameworks covering integration tests for pipelines, end-to-end validation of ETL/ELT workflows, and validation of ML inference workflows
  • Improve overall productivity, scalability, and reliability of the data and ML engineering ecosystem
  • Develop lightweight internal tools and dashboards using frameworks such as React, Streamlit, or similar technologies
  • Use internal tools to visualize data pipelines and workflows, demonstrate model inference capabilities, provide configuration and operational controls, and enable productivity monitoring and dashboards
  • Collaborate with cross-functional teams to identify automation opportunities and implement best practices
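
To give a concrete flavor of the "CLI tools and internal utilities" work described above, here is a minimal, illustrative sketch of an internal command-line tool built with Python's standard `argparse` module. The tool name (`platform-ctl`), subcommand (`sync-metadata`), and flags are hypothetical examples, not an existing Codvo.ai tool:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI skeleton for a hypothetical internal automation tool."""
    parser = argparse.ArgumentParser(
        prog="platform-ctl",
        description="Automate repetitive data-platform tasks (illustrative sketch only).",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    # Hypothetical subcommand: synchronize table metadata across environments.
    sync = sub.add_parser("sync-metadata", help="Synchronize table metadata")
    sync.add_argument("--env", choices=["dev", "staging", "prod"], default="dev")
    sync.add_argument("--dry-run", action="store_true",
                      help="Report planned changes without applying them")
    return parser


def run(argv: list[str]) -> dict:
    """Parse arguments and return the requested action as a plain dict,
    which a real tool would then dispatch to the appropriate automation."""
    args = build_parser().parse_args(argv)
    return {"command": args.command, "env": args.env, "dry_run": args.dry_run}
```

Keeping argument parsing separate from execution, as in `run` above, makes this style of tool straightforward to unit-test without invoking any real infrastructure.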
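
The testing-framework responsibilities above might, in a pytest-style setup, look like the following minimal sketch. The `normalize_records` transform is a hypothetical stand-in for a real pipeline step; the point is the shape of the test, not the transform itself:

```python
from datetime import datetime, timezone


def normalize_records(rows: list[dict]) -> list[dict]:
    """Toy pipeline step: lower-case keys, drop rows missing an 'id',
    and stamp each surviving row with a UTC load timestamp."""
    out = []
    for row in rows:
        cleaned = {k.lower(): v for k, v in row.items()}
        if cleaned.get("id") is None:
            continue
        cleaned["loaded_at"] = datetime.now(timezone.utc).isoformat()
        out.append(cleaned)
    return out


def test_normalize_records_drops_rows_without_id():
    """A pytest-style unit test asserting on the transform's contract."""
    rows = [{"ID": 1, "Name": "a"}, {"Name": "b"}]
    result = normalize_records(rows)
    assert len(result) == 1
    assert result[0]["id"] == 1
    assert "loaded_at" in result[0]
```

In practice, tests like this would run in CI against each pipeline step, with integration and end-to-end suites layered on top for full ETL/ELT and ML inference workflows.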