Data Scientist

Trellix, Frisco, TX

About The Position

Join our innovative team at Trellix, where you'll be instrumental in building the evaluation and benchmarking infrastructure for our cutting-edge agentic AI platform. This role sits at the intersection of data science and AI engineering — you'll own the science of how we know our AI works, designing evaluation frameworks, curating test datasets, and measuring the performance of AI agents, knowledge graphs, and foundation models across the Trellix security portfolio.

Requirements

  • 5+ years of professional experience in data science, ML engineering, or AI research, with hands-on work in evaluation or benchmarking of AI/ML systems
  • Strong proficiency in Python (pandas, NumPy, scikit-learn)
  • Statistical analysis and experimental design
  • Experience building and managing datasets for ML training and evaluation
  • Familiarity with annotation workflows and data quality frameworks
  • Hands-on experience evaluating Large Language Models (LLMs)
  • Familiarity with evaluation frameworks such as RAGAS, HELM, EleutherAI LM Eval, or equivalent
  • Experience designing LLM-as-judge pipelines or preference evaluation workflows
  • Understanding of hallucination detection, groundedness, and faithfulness metrics
  • Experience testing or evaluating agentic AI systems
  • Familiarity with tool use, ReAct-style reasoning, Deep Agents, and multi-agent coordination patterns
  • Ability to define pass/fail criteria for complex, multi-step agent tasks
  • Experience working with knowledge graphs (NebulaGraph, Neo4j, or equivalent)
  • Ability to evaluate graph quality, ontology coverage, and traversal correctness
  • Familiarity with embedding-based retrieval and vector databases (Qdrant preferred)
  • Experience with synthetic data generation for model and agent testing
  • Proficiency with vector databases and embedding pipelines
  • Familiarity with MLflow, Weights & Biases, Langfuse, or similar experiment tracking tools
  • Strong analytical thinking and ability to translate ambiguous quality questions into measurable metrics
  • Excellent written communication — able to document evaluation methodologies and present findings to technical and non-technical stakeholders
  • Collaborative mindset with a bias toward rigor and reproducibility

Nice To Haves

  • AWS experience preferred
  • Familiarity with the cybersecurity domain strongly preferred
  • Understanding of SOC workflows, threat detection, and incident response a plus
  • Experience evaluating AI systems in high-stakes or regulated environments a plus

Responsibilities

  • Architect and implement rigorous evaluation pipelines for agentic AI systems, including multi-step reasoning agents, retrieval-augmented pipelines, and autonomous SOC workflows.
  • Design and execute model evaluations to assess accuracy, reliability, latency, and safety across LLMs and agentic systems, including custom benchmarks tailored to cybersecurity use cases.
  • Develop methods to validate knowledge graph quality, coverage, and correctness, including entity resolution, relationship accuracy, and graph completeness metrics.
  • Build, curate, and maintain high-quality synthetic and real-world datasets for training, fine-tuning, and testing models and agents — including adversarial and edge-case datasets.
  • Design structured test harnesses for agentic systems covering tool use, multi-agent coordination, hallucination rates, decision quality, and task completion fidelity.
  • Define and instrument evaluation metrics, surface results through dashboards, and translate findings into actionable insights for engineering and product teams.
  • Stay current with the latest evaluation methodologies (e.g., LLM-as-judge, RAGAS, MT-Bench, custom evals) and adapt them to Trellix's security domain.
  • Partner closely with AI engineers, product managers, and security researchers to align evaluation standards with real-world performance requirements.

Benefits

  • Social programs
  • Flexible work hours
  • Family-friendly benefits
  • Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement