Lead Data Engineer

Deliberate Solutions
New York / Boston, NY
$160,000 - $220,000
Hybrid

About The Position

A patient wears an Oura ring to sleep. Their phone picks up a shift in activity patterns overnight. The next morning, a conversational AI agent conducts a brief voice-based check-in, and the vocal features, facial action units, and linguistic markers from that session all flow into the same clinical picture alongside the wearable data. Your job is to make sure every one of those signals, from raw sensor stream to clinically meaningful feature, arrives reliably, on time, and at high quality.

You'll architect and own the data infrastructure across all clinical data modalities: audio-visual features from conversational assessments, wearable biometrics, passive mobile sensing, and the feature pipelines that prepare them for fusion in our multimodal ML models. You'll also own the overall data architecture: how data flows into and through Deliberate AI, how it's stored, cataloged, and governed, and how it scales as we deploy across clinical trial sites on four continents. This isn't just a pipeline-building role; it's defining the technical strategy for how clinical data works at a company building the future of precision mental health care.

Requirements

  • Have significant experience in data engineering, including hands-on work in at least two of: audio/video processing, IoT/wearables, or mobile sensing
  • Have expert-level programming skills in Python with experience in performance optimization
  • Have a proven track record architecting and scaling data pipelines for multimedia or sensor data
  • Have deep experience with wearable device APIs (e.g., Fitbit, Oura, Apple Health)
  • Bring strong expertise in time-series data processing, real-time streaming architectures, and feature engineering
  • Have experience with cloud infrastructure (GCP / AWS) and distributed computing
  • Use agentic programming tools (e.g., Claude Code, Codex) as part of your workflow
  • Have a strong understanding of signal processing fundamentals across multiple modalities
  • Hold a Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience; Master's preferred)

Nice To Haves

  • Have experience with healthcare or clinical research data (HIPAA compliance, PHI handling)
  • Have knowledge of affective computing or speech processing
  • Bring background in real-time streaming architectures (Kafka, Pub/Sub, WebSockets) and distributed computing frameworks (Spark, Dask)
  • Have experience with machine learning for audio, video, or sensor applications
  • Have publications or open-source contributions in data engineering or digital health

Responsibilities

  • Design and implement the overall data architecture for ingestion, storage, cataloging, and governance of all clinical datasets — audio, video, wearable, mobile sensing, and physiological data from clinical sites worldwide
  • Build and maintain API integrations with commercial wearable devices (e.g., Oura Ring, Fitbit) to collect raw sensor streams (HRV, sleep stages, activity, heart rate) and engineer biometric features
  • Develop systems to capture and process passive mobile signals that trigger adaptive assessments, including real-time streaming and synchronization across modalities
  • Build automated QA systems to detect missing data, sensor failures, and anomalous readings — with data lineage tracking, pipeline observability, monitoring, alerting, and incident triage so problems are caught and resolved before they affect downstream models or clinical decisions
  • Design participant monitoring systems with automated data completeness checks, device health monitoring, and alert mechanisms supporting global deployment
  • Implement reliable incremental load patterns (idempotent runs, backfill strategies, and late-arriving data handling) so the platform stays correct as clinical sites come online across time zones and connectivity conditions
  • Evaluate and select the core data stack — orchestration, warehousing, transformation, and observability tooling — and own those decisions as the foundation the team builds on

Benefits

  • Early-stage equity with meaningful ownership: you're joining at a stage where individual grants are substantial
  • Comprehensive health, dental, and vision insurance
  • 401(k) with company match
  • Flexible PTO policy
  • Publication co-authorship on peer-reviewed clinical research: your data architecture shows up in the scientific record, not just the git log