Xealth
$155,000 - $225,000/Yr
Full-time • Mid Level
Seattle, WA

At Xealth, we're revolutionizing healthcare by leveraging data and automation to empower care providers (building on EHRs such as Epic and Cerner) to seamlessly prescribe, deliver, and monitor digital health for patients. We are a detail-oriented team, committed to maintaining the highest standards while moving with agility and impact. We are a highly skilled, collaborative, and passionate group, applying our expertise to improve health outcomes for millions. We believe in shared ownership and are looking for a self-driven team player ready to pioneer the next generation of intelligent, automated data insights. This role offers a unique opportunity to join our data engineering team to advance our data processing pipelines and analytics product offering. There is a strong preference for this person to sit in the Seattle office; however, we are open to candidates in other locations within the United States.

Responsibilities:
  • Design, build, and scale the services that power Xealth's analytics and reporting capabilities.
  • Apply solid computer science fundamentals to solve complex problems in distributed systems, data modeling, and data pipelines.
  • Perform expert-level data modeling and design, applying dimensional modeling and denormalization techniques for analytic workloads.
  • Consume and process high-volume bounded and unbounded data, build robust Change Data Capture (CDC) mechanisms, and ingest data from API calls and webhooks.
  • Design, build, and optimize high-volume, real-time streaming data pipelines using PySpark on Databricks.
  • Maintain and scale large data lake pipelines, ensuring high performance and cost-efficiency.
  • Write comprehensive unit and integration tests for data pipelines to ensure code quality and production reliability.
  • Partner with product managers and EHR specialists to translate clinical user behaviors into rich analytical datasets, unlocking critical insights that drive evidence-based improvements in healthcare processes.
  • Contribute to code reviews, system design discussions, and technical decisions that raise the engineering bar across the team.
  • Use AI-assisted coding tools like GitHub Copilot to streamline development, increase quality, and accelerate delivery.

Qualifications:
  • Data Engineering Expertise: 5+ years of professional experience building production-grade data pipelines and applications, with expert proficiency in Python, PySpark, and SQL. Familiarity with JavaScript is a plus. You must have solid hands-on experience with modern massively parallel data processing systems.
  • CS Fundamentals: Deep understanding of algorithms and data structures, with a specific focus on distributed computing principles (concurrency, partitioning, shuffling) necessary for processing large-scale datasets.
  • Optimization & Troubleshooting: Proficient in diagnosing complex failures in distributed processing jobs (e.g., Spark executor errors, memory leaks, data skew) using logs, distributed tracing, and performance metrics.
  • Modern SQL and NoSQL Database Design: Deep practical knowledge of open table formats such as Delta Lake, and proficiency with common big data file formats, including Apache Parquet and Apache Avro.
  • Infrastructure as Code (IaC): Experience applying IaC principles and tooling to automate the deployment and management of data pipelines.
  • API & Integration: Hands-on experience designing robust data ingestion frameworks via RESTful APIs and building event-driven architectures for real-time data flow.
  • Cloud & Distributed Systems: Experience designing and scaling cloud-native data platforms and orchestrating data workloads using AWS and Kubernetes.
  • Security & Governance: Prior experience in a regulated industry with high security requirements, and a solid working understanding of data security principles, particularly regarding Protected Health Information (PHI) and sensitive data governance.
  • Real-Time Stream Processing: Expertise building streaming data pipelines with technologies such as Apache Kafka and Apache Flink.
  • Observability: Experience implementing and utilizing Data Observability tools and practices to monitor data quality, lineage, and pipeline health.
  • Visualization: Experience building dashboards and visualizations to communicate data insights effectively.

Compensation & Benefits:
  • The compensation range for this position is $155,000 - $225,000 + bonus, depending on geographic market.
  • Paid parental leave.
  • Comprehensive medical, dental, and vision policies. Xealth covers 100% of employee premiums. We also provide Employee Assistance Programs.
  • Xealth provides your laptop and offers a home office stipend.
  • Generous learning & development opportunities for you to grow your skills and career.
  • 401(k) Match: Xealth offers a dollar-for-dollar match up to 3%.
  • Flexible time off & 10 standard holidays.
  • $500 yearly fitness stipend to spend on staying active.