Principal Data Engineer

CDS Global
New York, NY
$230,000 - $250,000
Onsite

About The Position

The Enterprise Corporate Data Team is looking for a Principal Data Engineer, a senior technical leader responsible for architecting the core data infrastructure and platforms that power enterprise-scale AI applications. Reporting to the VP of Engineering, this role will focus on building systems that surface content, audiences, products, and more through semantic search to support personalization, audience discovery, and intelligent content discovery. The Principal Data Engineer will lead the end-to-end design and implementation of scalable pipelines, platforms, and systems that support semantic search across massive volumes of structured and semi-structured data using Gen AI technology. This individual will also coordinate with a team of offshore engineers, ensuring consistent delivery, code quality, and alignment with business and technical goals. The ideal candidate will possess an entrepreneurial ethos, an ability to operate in a dynamic environment, and a working knowledge of the current digital media landscape. The candidate should also have expert knowledge of search systems, including but not limited to similarity, hybrid, and semantic search. This role is based in New York City.

Requirements

  • 10+ years of experience in data engineering, with significant experience building large-scale, distributed data systems to support data analysis, AI/ML, and key business use cases.
  • Proven expertise in search and search-related subsystems such as query understanding, search suggestions, ranking, and relevance, using modern strategies such as similarity search and hybrid search.
  • Strong coding and data architecture skills using TypeScript, Python, SQL, and tools like Apache Spark, Kafka, Airflow, Node.js, and cloud-native platforms (e.g., AWS, GCP, or Azure).
  • Hands-on experience integrating ML models into production environments for tasks such as entity extraction, text classification, or semantic search.
  • Familiarity with AI grounding strategies, including but not limited to entity graphs.
  • Experience managing and mentoring distributed/offshore engineering teams, with a track record of driving execution across time zones.
  • Excellent communication and collaboration skills, with the ability to bridge technical execution and business strategy.

Nice To Haves

  • Experience in digital media, publishing, ad tech, or content platforms.
  • Knowledge of LLMs and generative AI in applied settings (e.g., content summarization, auto-tagging, retrieval augmentation).
  • Working experience with OLAP and OLTP systems.

Responsibilities

  • Lead the design and implementation of high-performance OLAP and OLTP systems to support similarity and semantic search.
  • Architect scalable data platforms that integrate structured and unstructured data, including behavioral signals, content metadata, and user engagement data for Gen AI use cases.
  • Build systems that enable semantic enrichment of content through entity recognition, disambiguation, normalization and deduplication techniques.
  • Lead the design and build of high-throughput, low-latency, and highly relevant enterprise search systems using vector, graph, and other search strategies.
  • Familiarity with relevance measurement techniques such as DCG and NDCG.
  • Partner closely with other data engineers, ML engineers, and data scientists to deploy and operationalize models for content and audience intelligence.
  • Oversee and coordinate with an offshore engineering team, providing technical guidance, code reviews, and project oversight to ensure timely, high-quality deliverables.
  • Ensure best practices in data governance, quality, observability, and documentation across all engineering workflows.
  • Collaborate with stakeholders across product, marketing, and data science to translate business needs into scalable AI data systems.
  • Well versed in architecting, designing, and developing large-scale OLTP and OLAP systems.
  • Experience building and operating streaming systems using messaging systems such as Kafka, Pub/Sub, and SQS.
  • Experience building a RAG or Graph RAG system with Google, OpenAI, or another Gen AI platform.
  • Experience building a knowledge graph using Neo4j, Spanner, Neptune, or another tool is a plus.

Benefits

  • medical
  • dental
  • vision
  • disability
  • life insurance
  • 401(k)
  • paid holidays
  • paid time off
  • employee assistance programs