Data Engineer

Lakeview Loan Servicing
New York, NY
Remote

About The Position

The Data Engineer on the Nebula team plays a critical role in building and evolving the data foundation that powers analytics, reporting, AI development, and operational decision-making across the organization. This role is responsible for designing, building, and maintaining reliable, scalable, and flexible data systems that support a wide range of internal and external use cases.

Working across data ingestion, transformation, storage, modeling, and delivery, the Data Engineer partners closely with Product, Engineering, AI, Analytics, and domain Subject Matter Experts (SMEs) to translate complex business processes and data needs into production-ready data pipelines and platforms. The role contributes to the development and evolution of core data capabilities, including batch and real-time pipelines, operational and analytical data stores, semantic models, and BI-ready datasets.

Success requires strong technical depth across modern data tooling, sound systems thinking, and the ability to build reliable solutions in a cloud-based, regulated, high-stakes environment. The Data Engineer is expected to operate effectively in a modern engineering environment, using automation, observability, and infrastructure-as-code practices to deploy, manage, and improve data pipelines and platforms, while helping enable downstream analytics, reporting, product capabilities, and AI systems by ensuring that data is trustworthy, accessible, and fit for purpose.

Requirements

  • 2-4+ years of experience building and operating production-grade data pipelines and data systems
  • Strong experience with industry-standard tools and platforms for ETL/ELT, orchestration, data warehousing, streaming, and BI
  • Experience working with both OLTP and OLAP systems, with a strong understanding of the tradeoffs between transactional and analytical workloads
  • Experience building flexible data pipelines that integrate with many different source and destination types, including databases, APIs, files, message queues, SaaS platforms, and event streams
  • Experience supporting both batch and real-time data processing patterns
  • Experience deploying and operating data infrastructure on major cloud platforms such as AWS, GCP, or Azure
  • Strong SQL skills and experience with data modeling, transformation frameworks, and performance optimization
  • Experience building AI-powered capabilities on top of LLMs, including orchestration, evaluation, and data integration patterns
  • Experience with modern programming languages commonly used in data engineering, such as Python, Java, Scala, or Go
  • Comfort working with CI/CD, infrastructure-as-code, observability, and production operations for data systems
  • Strong judgment in ambiguous environments where requirements evolve and systems must balance speed, reliability, and flexibility
  • Clear communication skills with both technical and non-technical teammates

Nice To Haves

  • Experience with modern orchestration and transformation tools such as Airflow, Dagster, dbt, or similar platforms
  • Experience with cloud-native data warehouses or lakehouse platforms such as Snowflake, BigQuery, Redshift, Databricks, or equivalent technologies
  • Experience with streaming and real-time data platforms such as Kafka, Kinesis, SQS, or similar systems
  • Experience enabling BI and self-service analytics through curated datasets, semantic layers, and reporting platforms such as Looker, Power BI, Tableau, or similar tools
  • Experience in fintech, mortgage, lending, payments, insurance, or other regulated domains
  • Experience building data platforms that support AI, machine learning, or decisioning workflows
  • Experience improving data quality, reliability, cost efficiency, and platform scalability as a system grows

Responsibilities

  • Design, build, and maintain robust data pipelines for a wide variety of input and output sources, including internal systems, third-party platforms, files, APIs, event streams, and databases
  • Develop scalable ETL and ELT workflows for both batch and real-time processing
  • Ensure pipelines are reliable, testable, observable, and easy to extend as business needs evolve
  • Build reusable data integration patterns that support growing volumes, new source systems, and downstream consumers across analytics, applications, and AI initiatives
  • Design and manage data architectures that support OLTP, OLAP, and reporting workloads across operational and analytical environments
  • Build and optimize data models, warehouse schemas, and curated datasets for analytics and BI use cases
  • Contribute to the design and operation of modern data platforms, including warehouses, lakehouses, streaming systems, and supporting orchestration frameworks
  • Help define patterns for data storage, partitioning, performance optimization, retention, and lifecycle management
  • Deploy, operate, and improve data pipelines and data stores on major cloud platforms such as AWS, GCP, or Azure
  • Use infrastructure-as-code, CI/CD, and automation practices to improve deployment speed, consistency, and reliability
  • Monitor production data systems using logging, alerting, and observability tooling to proactively identify and resolve issues
  • Support secure, resilient, and cost-conscious operation of cloud-based data infrastructure
  • Implement data quality checks, validation rules, reconciliation processes, and monitoring to ensure trustworthy data across systems
  • Establish and maintain standards for lineage, documentation, metadata, schema evolution, and operational runbooks
  • Partner with stakeholders to improve data accessibility, consistency, and usability while maintaining appropriate controls and governance
  • Contribute to practices that support security, privacy, auditability, and compliance in a regulated environment
  • Partner closely with Product, Engineering, and business stakeholders to understand data needs, workflows, and constraints
  • Translate business and operational requirements into clean, scalable, and maintainable data solutions
  • Support downstream consumers of data, including analysts, researchers, product teams, and operational users
  • Communicate clearly with both technical and non-technical stakeholders about data availability, quality, tradeoffs, and delivery timelines
  • Continuously improve pipeline performance, reliability, scalability, and developer productivity
  • Identify opportunities to simplify architecture, reduce operational toil, and improve data platform leverage across teams
  • Operate with a strong bias toward action and iterative delivery, moving quickly from problem definition to implementation and improvement
  • Help raise the bar on engineering quality through thoughtful design, testing, documentation, and operational discipline

Benefits

  • Medical coverage starting on day one
  • Company-matched 401(k)