Staff Data Engineer - RCM (Remote)

Rula
Los Angeles, CA (Remote)

About The Position

At Rula, our mission is to make mental health care more accessible and effective for those who need it. As a Staff Data Engineer for Operational Reporting, you will oversee the design and implementation of a greenfield near real-time data platform, starting with micro-batching pipelines using Kafka to deliver critical operational reports and evolving into a scalable Apache Flink architecture for sub-second analytics. Your work will power real-time dashboards and insights that enable our providers, leadership, and operational teams to make data-driven decisions, ultimately improving patient outcomes.

You will join our collaborative data team, nested within the broader engineering organization, working closely with business analysts, product managers, and data experts to transform raw event streams into reliable, actionable reporting data. Your daily responsibilities (building fault-tolerant pipelines, ensuring data accuracy, and optimizing for low-latency delivery) will lay the foundation for Rula's near real-time data capabilities.

This role offers the opportunity to own a strategic transition from micro-batching to a Flink-based streaming architecture, driving innovation in how we harness data to support our mission. If you're passionate about turning complex data into impactful insights that advance mental health care, this is your chance to make a meaningful difference.

Requirements

  • Data Pipeline Development (8+ Years). Experience designing and maintaining scalable ETL/ELT pipelines for operational reporting using Kafka, Glue, dbt, Dagster, and Airflow; using Python and SQL for data transformation and quality checks; and working with Flink and Spark Streaming to build low-latency, near real-time pipelines.
  • Cloud Infrastructure & Data Warehousing (8+ Years overall, 4+ Years in AWS). Proficiency building and optimizing data pipelines using AWS services such as S3, Redshift, Glue, IAM, Kinesis, and EMR. Experience across GCP (BigQuery, Dataflow) and Azure (Synapse, Data Factory). Experience optimizing data warehouses (Redshift, Snowflake, BigQuery) and managing Data Lakes (S3, Delta Lake) for scalable, low-latency analytics. Ability to ensure cost efficiency, scalability, and compliance (CPRA, HIPAA) while supporting a migration toward a Flink-based near real-time architecture.
  • Data Quality & Governance (8+ Years). Experience implementing scalable data validation, quality checks (e.g., deduplication, consistency), and error-handling mechanisms tailored for operational reporting pipelines, ensuring high-fidelity data for real-time dashboards and analytics. Proficiency in designing and enforcing data governance practices, including metadata management, lineage tracking for auditable reporting, and compliance with regulations like CPRA or HIPAA in Data Lake environments (e.g., AWS S3, Delta Lake).
  • Performance Optimization (3+ Years). Experience optimizing data pipelines, queries, and large-scale datasets for efficiency and scalability in operational reporting systems, with a focus on achieving low-latency delivery. Proficiency in tuning high-throughput streaming systems, including optimizing resource usage and implementing best practices for partitioning, caching, and indexing.
  • Security & Compliance (3+ Years). Experience implementing data security measures, including encryption, role-based access control (RBAC), and data masking, to protect sensitive data in operational reporting pipelines and Data Lakes (e.g., AWS S3, Delta Lake). Strong understanding of compliance standards such as HIPAA and CPRA, with hands-on expertise in applying these standards to streaming systems like Apache Kafka and Apache Flink. Demonstrated ability to ensure auditability and security in data workflows, supporting reliable and compliant near real-time analytics during the transition from micro-batching to a Flink-based architecture.
  • Collaboration & Communication (5+ Years). Strong ability to work cross-functionally with business analysts, product managers, leadership, and other stakeholders to define and deliver operational reporting requirements. Exceptional communication skills to translate complex technical concepts into clear, actionable insights for non-technical audiences. Proven adaptability to thrive in a fast-paced startup environment, collaborating effectively to support the rapid development and evolution of a near real-time data platform while aligning with Rula's mission to improve mental health care outcomes.

Nice To Haves

  • Hands-on experience with AWS tools like S3, Glue, EMR, SageMaker, and Lambda for building scalable ETL/ELT pipelines optimized for ML/LLM training, including feature engineering, data versioning, and handling large-scale unstructured data.
  • Demonstrated ability to maintain data integrity and accuracy in streaming systems like Apache Kafka and Apache Flink, supporting reliable operational insights during the transition from micro-batching to a near real-time architecture.
  • Familiarity with infrastructure as code (IaC) tools like Terraform or CloudFormation for managing cloud resources.
  • Experience implementing and maintaining CI/CD pipelines for data workflows.
  • Demonstrated ability to enhance pipeline performance to support near real-time analytics while maintaining cost efficiency and reliability during the transition from micro-batching to a streaming architecture.
  • Strong ability to partner with data scientists and ML engineers to design efficient pipelines, using orchestration tools (e.g., Airflow, Dagster) for incremental loading and cost optimization, while monitoring performance metrics like latency and resource utilization in AWS environments.

Benefits

  • 100% remote work environment (US-based only)
  • Attractive pay and benefits
  • Comprehensive health benefits: Medical, dental, vision, life, disability, and FSA/HSA
  • 401(k) plan access
  • Generous time-off policies: Including 2 company-wide shutdown weeks each year for self-care (for most employees)
  • Paid parental leave: Available for all parents, including birthing, non-birthing, adopting, and fostering
  • Employee Assistance Program (EAP): Support for your mental and physical health
  • New hire home office stipend
  • Quarterly department stipend: Fund team-building activities or in-person gatherings
  • Wellness events and lunch & learns
  • Community and employee resource groups