Staff ML/LLM Ops Engineer

LVT
Seattle, WA
$213,300 - $272,000

About The Position

We are seeking a Staff ML/LLM Ops Engineer to own the model lifecycle as infrastructure, turning the path from research to production into standardized, self-serve tooling. The model portfolio this platform serves spans both the computer-vision models in production today and a growing set of LLM, VLM, and agentic workloads. Bringing those generative workloads under the same lifecycle discipline (serving, version pinning, evaluation, guardrails, and cost and latency monitoring) is part of this role's scope. This is a senior individual-contributor and technical-leadership role. You will partner closely with AI/ML research, the application backend team, and platform and infrastructure teams. You should be equally comfortable discussing model-serving architectures, CI/CD and rollback design, polyglot service contracts, and production observability.

Requirements

  • 8+ years of engineering experience with deep ML-infrastructure / MLOps work, including building and operating a model deployment, serving, and monitoring platform in production.
  • Hands-on experience operating LLM or VLM workloads in production, including model serving or managed-provider integration, prompt and version management, generative evaluation, guardrails, and token cost and latency control.
  • Experience designing self-serve ML deployment for other teams, including model registry and packaging, CI/CD for models, serving contracts, rollback, and drift/quality monitoring.
  • Strong systems and API design judgment across a polyglot boundary, with the operational maturity to own security, observability, and on-call trade-offs.
  • A track record of setting technical direction and leveling up engineers (technical leadership; formal management not required).
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Nice To Haves

  • Computer Vision / video model inference at scale (GPU serving, latency and cost optimization).
  • Cloud-native infrastructure (Kubernetes, Argo, or a comparable deployment stack).
  • Experience standing up an ML platform from zero on a team that did not have one.
  • Experience deploying AI models to edge environments (e.g. NVIDIA Jetson or similar).
  • Agentic and generative tooling: LangGraph, MCP frameworks, vector databases, and inference/serving platforms.

Responsibilities

  • Own the model lifecycle end to end: standardized packaging, a model CI/CD path, a serving layer with stable, versioned contracts, automated deployment and rollback, and monitoring and drift detection.
  • Bring LLM, VLM, and agentic workloads under the same platform discipline as the vision models: serving with models and prompts version-pinned as deployable, rollback-able artifacts; generative evaluation and regression suites that don't reduce to precision/recall; production guardrails such as input/output filtering and jailbreak and refusal monitoring; and token-level cost and latency observability. Where retrieval or agent orchestration is in play, own the operational seams (vector stores, request tracing) the same way.
  • Make the path from research to production self-serve and safe by encoding the security, observability, and on-call guardrails engineers enforce by hand today, so model owners can ship without lowering the operational bar.
  • Define and own the contract boundary between the model platform and the application backend so engineers integrate against deployed models independently.
  • Set technical standards and mentor ICs, steering productionization work toward the platform and growing the function as the team forms.

Benefits

  • Comprehensive health, dental and vision coverage
  • Retirement benefits (401k match up to 4%)
  • Flexible PTO
  • Employee equity program
© 2026 Teal Labs, Inc