Staff Data Engineer

brightwheel

About The Position

Brightwheel is seeking a Staff Data Engineer to serve as technical lead on our Data Engineering team. As a Staff Data Engineer at brightwheel, you will architect and drive the evolution of our data platform, partnering with technical leadership to shape our data and AI strategy. You will design and scale sophisticated data pipelines that process billions of records across diverse systems, powering analytics for internal teams, customer-facing insights, and the AI/ML capabilities that differentiate our product.

You are a technical leader with deep data engineering expertise and a passion for the intersection of data engineering and AI. You thrive on complex architectural challenges and have the vision to balance immediate business needs with long-term platform scalability. You excel at driving technical decisions and influencing across the organization to deliver high-impact outcomes.

You are a curious, strategic thinker who takes full ownership of critical initiatives with enterprise-wide visibility. You navigate ambiguity effectively, juggle competing priorities, and have the technical depth and communication skills to drive consensus on complex problems. You're excited to shape the future of data and AI at brightwheel while delivering measurable value to our customers and business.

Requirements

  • 6+ years of work experience as a data engineer or DevOps engineer with strong proficiency in Python and modern data engineering practices
  • 5+ years of experience deploying and managing data processing infrastructure as code (IaC) in AWS or other cloud environments (GCP, Azure)
  • 3+ years of experience building and maintaining streaming data pipelines in production (Kafka, Kinesis, Pub/Sub) with data lake/lakehouse architectures (Delta Lake, Iceberg, Hudi)
  • Experience designing, developing, and deploying ML/LLM/AI pipelines in production environments, including experience with model serving, feature engineering, and MLOps practices
  • Expert-level understanding of distributed data processing technologies and their internals (e.g., Spark execution model, query optimization in Redshift/BigQuery/Snowflake, storage formats like Parquet/ORC)
  • Proven track record of independently architecting scalable data solutions, from requirements gathering and technical design through implementation and cost optimization, with a focus on long-term maintainability and ROI

Nice To Haves

  • Proven track record of technical leadership, including mentoring senior engineers, driving engineering standards and best practices, and influencing data platform strategy across the organization
  • Hands-on experience architecting federated query engines (DuckDB, Trino, Presto, Starburst) over lakehouse platforms, including catalog integration (Glue, Iceberg, Hudi), query optimization strategies, and cost-effective compute scaling patterns
  • Experience architecting and scaling production ML/LLM inference pipelines, including model serving infrastructure (SageMaker, Vertex AI, Bedrock), vector databases, and real-time feature stores
  • Hands-on experience building RAG (Retrieval Augmented Generation) systems, semantic search pipelines, or LLM-powered applications with production-grade prompt engineering, context management, and evaluation frameworks
  • Deep expertise building orchestration platforms with Airflow (or similar), including custom operators, dynamic DAG generation, and framework-level optimizations for complex dependency management
  • Experience integrating and architecting around enterprise CRM platforms (Salesforce, HubSpot), including custom objects, complex data models, and bidirectional sync patterns
  • Advanced experience with serverless and event-driven architectures, including designing systems that leverage AWS Lambda, Step Functions, EventBridge, or Databricks workflows for cost-efficient, auto-scaling data processing
  • Experience building customer-facing embedded analytics solutions (Cube.js, Metabase, Superset, or similar) with complex data modeling, access control, and performance optimization

Responsibilities

  • Architect and lead the evolution of our modern data platform, driving technical decisions on tooling, infrastructure patterns, and scalability strategies that support both traditional analytics and AI/ML workloads at scale
  • Design and build production LLM pipelines and infrastructure that power intelligent operations
  • Own end-to-end data acquisition and integration architecture across diverse sources (CRMs, clickstream, third-party APIs), establishing patterns and frameworks that enable self-service data access while maintaining data quality and governance
  • Drive the technical roadmap for customer-facing analytics and AI-powered insights, partnering with product and engineering teams to deliver embedded analytics, predictive models, and intelligent recommendations that differentiate our product
  • Lead performance optimization and architectural improvements across critical data infrastructure, identifying bottlenecks, re-architecting legacy systems, and implementing cost-effective scaling strategies that reduce latency and operational costs
  • Establish and evolve data engineering best practices, including advancing our CI/CD maturity, infrastructure-as-code standards, observability frameworks, and deployment automation across AWS services
  • Mentor and influence engineering culture, conducting design reviews, providing technical guidance to engineers across the organization, and championing data platform adoption and best practices


What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

251-500 employees
