As a Senior Data Engineer, you will own the data infrastructure that powers real-time experiences for our members. You will build and scale pipelines that move data from dozens of upstream services (across Kafka event streams and transactional databases) into a unified data platform that serves both real-time APIs and analytical workloads on Databricks.

Your work will directly enable AI-powered coaching assistants and physical therapy tools that use live member data, including engagement logs and clinical data, to generate personalized recommendations. You will work at the intersection of data engineering and AI, building the reliable, low-latency data foundation that these systems depend on.

You will work in a modern stack: Python, Flink, and PySpark for pipeline development, Kafka for event streaming, Delta Lake for scalable storage, and Aurora PostgreSQL for operational data.

This is a high-ownership role. You will work closely with application engineers, data scientists, and AI teams across the organization, defining how data flows from the moment it is created to the moment it is consumed. You will also help establish the standards and practices that enable product teams to take ownership of their own data in a HIPAA-compliant environment.

If you are excited about building the data infrastructure behind AI systems that have a direct impact on people's health, this role is for you.

Our tech stack: Python, SQL, dbt, Airflow, PostgreSQL, MySQL, REST, Aptible, Docker, Tonic.ai, Terraform, Spark, Kafka, Flink, Fivetran, Databricks, AWS (S3, Lambda, Kinesis, RDS, Glue).
Job Type
Full-time
Career Level
Senior
Number of Employees
501-1,000 employees