About The Position

We’re looking for a hands-on technical leader to architect, fine-tune, and deploy on-device small language models (SLMs) for consumer security at scale. You’ll lead a focused team of 3–5 senior engineers while remaining deeply involved in the code and technical architecture. Your core responsibility is building high-performance, privacy-preserving AI models that run directly on user devices (Mac, iOS, Android, Linux). You’ll own model optimization, fine-tuning for tool-use accuracy, evaluation frameworks, and cost-aware deployment strategies. While you won’t own the agent orchestration platform itself, you’ll work closely with it to ensure models behave correctly in multi-turn conversations and make reliable tool-calling decisions.

This role sits at the intersection of edge ML, applied LLMs, and production engineering. Success requires navigating real-world tradeoffs: latency vs. capability, privacy vs. accuracy, on-device vs. cloud execution, and cost vs. performance. This is not a traditional director role: you’ll spend 60%+ of your time on technical architecture and implementation, with the remainder focused on mentoring senior engineers and setting technical direction.

This is a hybrid remote position based in one of our hub locations: Frisco, TX or San Jose, CA. You will be required to be onsite on an as-needed basis, typically 1–4 days per month. We are only considering candidates within commutable distance of one of these locations and are not offering relocation assistance at this time.

Requirements

  • 10+ years of software engineering experience, with 5+ years focused on ML/AI
  • Proven experience shipping ML models to production, with transferable skills for deploying them on edge or mobile platforms
  • Experience with conversational AI systems and tool/function-calling architectures
  • Strong Python and systems programming skills (C++ or Rust) for performance-critical code
  • Deep expertise in model optimization (INT4/INT8 quantization, pruning, distillation)
  • Hands-on experience with PyTorch and at least one edge deployment framework (TensorFlow Lite, CoreML, ONNX Runtime, or llama.cpp)
  • Experience building evaluation and benchmarking frameworks for ML systems

Nice To Haves

  • Experience applying ML systems in security, safety, or other adversarial domains
  • Master’s degree in CS, ML, or a related field (or equivalent practical experience)

Responsibilities

  • Design and deploy small language models optimized for on-device inference (Mac, iOS, Android, Linux)
  • Lead model optimization efforts including quantization, pruning, distillation, and efficient inference pipelines
  • Fine-tune models to improve tool selection accuracy and conversational behavior in security-focused workflows
  • Build evaluation frameworks to measure model efficacy, tool-calling accuracy, conversation quality, and safety in production
  • Create synthetic data and workflow simulations to train and validate security-relevant conversations
  • Partner closely with agent orchestration systems to optimize multi-turn dialogue behavior and state handling
  • Implement cost-optimization strategies such as intelligent on-device vs. cloud routing, prompt caching, batching, and token efficiency
  • Integrate cloud-based LLMs when deeper reasoning or broader context is required
  • Build production ML systems that detect threats and protect users directly on-device
  • Set technical standards and architectural direction for AI/ML across the security platform
  • Mentor principal engineers and architects while remaining hands-on

Benefits

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement