About The Position

We’re looking for a Senior Data Engineer who builds data infrastructure with velocity and precision. You’ll design pipelines, architect lakehouse solutions, and create the foundation that powers our products—leveraging modern AI tools to move faster without sacrificing quality. You bring deep experience with ClickHouse, open-source data tooling, and modern lakehouse patterns.

Requirements

  • 5+ years of experience in data engineering, analytics engineering, or related roles
  • Deep expertise with ClickHouse - deployment, optimization, schema design, and materialized views
  • Strong experience with Postgres and understanding of when to leverage transactional vs. analytical databases
  • Strong experience with lakehouse architecture patterns (Delta Lake, Apache Iceberg, Apache Hudi)
  • Proficiency building ETL/ELT pipelines with open-source tools (Airflow, Dagster, dbt, Prefect, or similar)
  • Hands-on experience with streaming and batch processing frameworks (Kafka, Flink, Spark)
  • Strong SQL skills and deep proficiency in Python
  • TypeScript proficiency for integration with application services
  • Demonstrated fluency with AI coding assistants (Cursor, Copilot, Claude, etc.) as part of your daily workflow
  • Experience using LLMs for data transformation, validation, or pipeline generation
  • A spec-first mindset - you document what you’re building before you build it
  • Experience with real-time analytics and sub-second query requirements
  • Familiarity with data contracts, schema registries, and data mesh principles
  • Contributions to open-source data projects
  • Background working in cross-functional product teams

Responsibilities

  • Design and implement lakehouse architecture using open-source technologies
  • Build and optimize ClickHouse deployments for high-performance analytical workloads
  • Develop custom data transforms and ETL/ELT pipelines using well-supported open-source tools
  • Create data models that bridge our Postgres application databases with our ClickHouse analytics layer
  • Partner with product and engineering to define data models that serve both analytical and operational needs
  • Write specifications before writing code—defining contracts, schemas, and expected behaviors upfront
  • Use AI-assisted coding tools daily to accelerate development and reduce toil
  • Establish data quality frameworks and observability across the pipeline
  • Optimize for performance, cost, and reliability at scale

The company reserves the right to add or change duties at any time.