Senior Data Engineer

Tebra
Remote

About The Position

As a Senior Data Engineer focused on AI/ML, you'll architect, build, and operate the specialized data infrastructure that powers Tebra’s intelligent features. You will serve as a technical subject matter expert in data systems, partnering closely with Machine Learning Engineers to transform raw, messy healthcare data into high-quality training sets and real-time inference features. This is a hands-on role where you will own large data sub-systems, translating business requirements into software solutions that accelerate our ability to deploy AI. You’ll tackle technical challenges head-on—from data versioning to feature serving—ensuring our ML models are fed by reliable, scalable, and performant pipelines.

Requirements

  • 5+ years of professional software development experience.
  • Deep technical subject matter expertise in 3+ general areas of software development (e.g., Big Data Processing, Distributed Systems, Data Modeling).
  • 3+ years of hands-on experience in Data Engineering with a focus on supporting analytics or data science teams.
  • Advanced proficiency in Python and SQL. You are comfortable writing production-grade code for data transformation and orchestration (not just scripts).
  • Proven ability to architect and write software that enables ML at scale—moving beyond simple ETL to building robust data platforms.
  • Strong background in modern data infrastructure relevant to AI (e.g., Spark, Airflow, Kafka, Vector Databases).
  • Experience with Data Lake/Lakehouse architectures (e.g., Databricks, Snowflake, Delta Lake) and understanding how to structure data for efficient model training.
  • Familiarity with MLOps concepts: You understand the difference between a training set and a test set, and you know what "data leakage" is and how to prevent it in the pipeline.
  • Proven ability to deploy and maintain data systems in production with CI/CD, monitoring, and alerting.
  • Excellent technical communication and a product mindset—comfortable driving initiatives from concept to delivery.

Nice To Haves

  • Background in healthcare software operations or working with structured business data.
  • Experience implementing or managing a Feature Store (e.g., Feast, Tecton).
  • Familiarity with data version control tools (e.g., DVC, lakeFS).
  • Published research or conference papers in data engineering, distributed systems, or machine learning.
  • Experience with retrieval-augmented generation (RAG) pipelines or vector search infrastructure.
  • Contributions to open-source data or ML infrastructure projects.

Responsibilities

  • Architect and write software that solves complex business problems, specifically designing scalable pipelines for feature extraction, training data generation, and model monitoring logs.
  • Own and serve as a Subject Matter Expert (SME) for large software systems, such as the organization's Feature Store or Data Lakehouse, ensuring data availability for both experimentation and production inference.
  • Continuously monitor data pipelines in production, detect data drift or quality anomalies, and implement automated recovery systems to ensure the reliability and freshness of features and training data over time.
  • Lead Engineering Design Reviews, providing well-articulated and reasoned explanations for architecture decisions (e.g., choosing between batch processing for training vs. real-time streaming for inference).
  • Write software frameworks that can be extended by others on the team, such as automated data quality checks and schema validation tools that prevent training-serving skew.
  • Translate business requirements into software solutions, bridging the gap between raw data sources and the structured inputs needed for advanced ML models.
  • Know when and how to optimize complex code, specifically tuning Spark jobs or SQL queries to handle massive datasets required for Large Language Model (LLM) fine-tuning or deep learning.
  • Collaborate cross-functionally, including with ML engineers, to implement MLOps best practices such as data versioning, lineage tracking, and reproducibility.
  • Scope tasks expertly, breaking down complex data infrastructure initiatives into manageable deliverables for the squad.

Benefits

  • United States: In addition to our healthcare benefits, we also offer amazing perks! Need work-from-home basics? We offer a discount through Dell! We also offer a number of resources to help you keep your mind and body healthy. Check out Gympass for a great workout, or the Telus Employee Assistance Program to find mental health resources, along with other resources for everyday occurrences.
  • Costa Rica: To assist with all of life’s needs, Tebra also offers a wellness and childcare subsidy and a University/Education discount! We also offer a number of resources to help you keep your mind and body healthy. Check out Gympass for access to health and fitness apps, or Telus Employee Assistance Program to find mental health resources, along with other resources for everyday occurrences.