Senior Cybersecurity Data Engineer - AI/ML SME

Workday
Reston, VA
Hybrid

About The Position

We are a newly formed, forward-looking Cybersecurity Data Engineering & Enablement Team driving the future of our enterprise defense strategy. Our mission is to build a next-generation, centralized data lakehouse that unifies all security telemetry into a single, high-performance ecosystem. Operating across two specialized verticals, Data Engineering (ingestion, enrichment, and semantic layers) and Data Platform (foundational infrastructure, security architecture, and AI enablement), we are designing a scalable, cloud-native foundation from the ground up. By combining cutting-edge data architecture with advanced analytics, we empower our threat hunters, data scientists, and incident responders with the real-time, trusted intelligence needed to protect the enterprise at scale.

We are seeking a highly specialized Senior Data Engineer - Cybersecurity to serve as the Subject Matter Expert (SME) for AI/ML and Platform Integration. This critical role sits at the intersection of core data platform infrastructure, advanced analytics, and external system integrations. Your primary mission is to optimize our data platform to serve as a high-performance engine for Data Science, Machine Learning (ML), and Generative AI (GenAI) workloads. Additionally, you will own the integration fabric of the platform: building the robust APIs, webhook ingestion engines, and data connectors that seamlessly sync our central lakehouse with downstream business applications, SaaS platforms, and third-party ecosystems.

Requirements

  • 5+ years of data engineering experience, with at least 2 years dedicated to supporting machine learning platforms, MLOps, or complex platform integrations.
  • Deep hands-on experience with AWS SageMaker, MLflow, or equivalent cloud-native ML platforms.
  • Proven experience implementing feature store frameworks (e.g., Feast, SageMaker Feature Store) and vector databases (e.g., Pinecone, Milvus, Qdrant, or pgvector).
  • Strong experience using Apache Spark / AWS EMR, Ray, or Dask to process massive datasets for feature extraction and model preparation.
  • Expert knowledge of building REST APIs and webhooks, and of using streaming tools (e.g., AWS Kinesis, Kafka) for real-time integration.
  • Advanced proficiency in Python (including ML ecosystems like pandas, NumPy, scikit-learn) and SQL.
  • Extensive experience with GitHub Actions, GitLab CI, or Jenkins for data/ML pipelines.

Nice To Haves

  • Experience deploying and fine-tuning open-source LLMs or orchestrating AI agents using frameworks like LangChain or LlamaIndex.
  • Experience with reverse-ETL tools (e.g., Census, Hightouch) or enterprise integration platforms.

Responsibilities

  • Design, provision, and maintain the platform infrastructure required for end-to-end machine learning lifecycles.
  • Optimize the platform for distributed training, model evaluation, and batch/real-time inference.
  • Design and manage the enterprise Feature Store. Ensure consistent, low-latency feature delivery, preventing data leakage between training pipelines and real-time production inference.
  • Architect and maintain vector databases and indexing pipelines required to support Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) patterns, and semantic search.
  • Serve as the SME for how external applications interact with the data lakehouse. Design, build, and secure high-throughput APIs, data connectors, and reverse-ETL patterns to sync data back into business systems (e.g., CRMs, ERPs, marketing automation).
  • Partner closely with Data Scientists and MLOps teams to establish CI/CD automation for ML (MLOps).
  • Transition experimental, unoptimized data science notebooks into resilient, production-grade automated workflows.
  • Configure and optimize compute engines tailored for heavy mathematical and data science workloads (e.g., Ray, Spark/EMR GPU instances).

Benefits

  • Workday Bonus Plan or a role-specific commission/bonus
  • Annual refresh stock grants
  • Comprehensive benefits