Senior Data Engineer (AI)

South Geeks · Wyoming, NY
Posted 1 day ago · Remote

About The Position

We’re looking for a Senior Data Engineer who thrives at the intersection of data engineering and applied AI. This is a hands-on, high-ownership role in which you will design, build, and operate systems that extract, transform, and validate structured data from complex leasing documents. You will own the full ELT loop, turning messy, real-world documents into clean, reliable JSON that powers web applications and downstream systems. In this early-stage environment, iteration and agility are key: you’ll scope ambiguous problems, experiment with AI-driven extraction techniques, and continuously refine pipelines to improve accuracy and scalability.
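To give a concrete flavor of the work, here is a minimal sketch of the validation step in such a pipeline: parsing an LLM's JSON response and enforcing an expected schema before the record is loaded downstream. The field names (`tenant_name`, `monthly_rent`, `lease_start`) are illustrative assumptions, not part of this posting.

```python
import json
from typing import Any

# Illustrative schema: each required field maps to its expected type(s).
REQUIRED_FIELDS = {
    "tenant_name": str,
    "monthly_rent": (int, float),
    "lease_start": str,
}

def validate_extraction(raw: str) -> dict[str, Any]:
    """Parse an LLM response and enforce the expected schema.

    Raises ValueError on missing fields or wrong types, so that a bad
    extraction never reaches the database.
    """
    record = json.loads(raw)
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected):
            raise ValueError(f"wrong type for field: {field}")
    return record

# A well-formed extraction passes validation and is returned as a dict.
good = validate_extraction(
    '{"tenant_name": "Acme LLC", "monthly_rent": 2500, "lease_start": "2024-01-01"}'
)
```

In practice this kind of check sits between the LLM API call and the load into PostgreSQL, and would typically grow into a fuller framework (nested schemas, cross-field rules, quality metrics).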

Requirements

  • Strong Python engineering experience building data extraction and transformation workflows.
  • Experience calling LLM APIs (OpenAI, Anthropic, or similar) and crafting prompts for structured data extraction.
  • Solid understanding of ELT patterns and data pipeline architecture.
  • Experience working with AWS S3 (or similar object storage) and PostgreSQL (or similar relational databases).
  • Experience designing JSON schemas and handling nested or semi-structured data.
  • Strong data validation mindset and experience implementing quality controls.
  • Ability to work independently in a fast-moving, early-stage environment.

Nice To Haves

  • Experience building document processing pipelines (PDFs, contracts, leases, or similar).
  • Experience evaluating and comparing LLM outputs for consistency and accuracy.
  • Familiarity with AI orchestration platforms.
  • Background in real estate, leasing, or financial document processing.

Responsibilities

  • Design and iterate data extraction and transformation pipelines that convert unstructured leasing documents into structured JSON stores.
  • Write and optimize LLM API calls and prompts to extract and interpret text data at scale.
  • Orchestrate AI-driven workflows integrating multiple LLMs to handle diverse document types and edge cases.
  • Build and maintain ELT workflows in Python, managing data flows between cloud storage and relational databases.
  • Develop data quality and validation frameworks to ensure structured outputs are accurate and production-ready.
  • Implement monitoring, alerting, and automated quality checks across extraction pipelines.
  • Collaborate with product and engineering teams to define and evolve data schemas.
  • Own the pipeline end to end, from raw ingestion to validated structured output.

Benefits

  • Long-term projects
  • 100% remote work
  • Payment in USD
  • Paid Time Off (PTO)
  • Work-from-home & training reimbursement
  • English lessons
  • Technical training
  • Career coaching