Senior Software Engineer, ML Inference

Cognitiv | Bellevue, WA
$160,000 - $210,000 | Hybrid

About The Position

We are looking for a Senior Software Engineer focused on ML inference to help build and scale the systems that power Cognitiv’s ML-driven products. In this role, you’ll work on performance-critical inference systems that enable real-time decision-making at scale. You’ll collaborate closely with ML Researchers, Product, and other Engineers to design, implement, and optimize production ML services used by some of the world’s biggest brands. This is a hands-on engineering role with meaningful technical ownership and room to grow in scope and influence. Location: Hybrid, Monday through Wednesday out of our Bellevue, WA office.

Requirements

  • Experienced ML Engineer: 4+ years working with ML systems in production, including hands-on experience with PyTorch or LibTorch.
  • Strong Systems Engineer: 4+ years of professional C++ experience with attention to performance and memory efficiency.
  • Inference-Focused: Experience optimizing models and inference pipelines for real-world constraints like latency and scale.
  • Collaborative Communicator: Comfortable explaining technical tradeoffs and working closely with cross-functional partners.
  • Ownership-Driven: Able to take responsibility for the services you build and improve them over time.
  • Technically Educated: Bachelor’s degree or higher in Computer Science, Engineering, Math, Physics, or a related field.

Nice To Haves

  • Experience with GPU or hardware-accelerated inference (e.g., NVIDIA TensorRT)
  • Experience with Docker and Kubernetes
  • Familiarity with Infrastructure-as-Code tools (Terraform, Ansible)
  • Exposure to advanced ML architectures (e.g., two-tower models, teacher-student learning)
  • Experience with Rust
  • Familiarity with MLOps tooling (monitoring, lifecycle management, automation)
  • Experience using AI-assisted development tools

Responsibilities

  • Build and optimize ML inference systems used in production, leveraging both industry-standard frameworks and in-house technology.
  • Implement performance-critical components in C++ and PyTorch/LibTorch with a focus on latency, throughput, and reliability (a brief illustrative sketch of this kind of work follows this list).
  • Collaborate cross-functionally with ML Research, Product, and Engineering partners to bring models from experimentation into production.
  • Improve existing systems by identifying performance bottlenecks, reliability gaps, and scalability issues.
  • Contribute to design discussions and technical reviews for inference-related services.
  • Write high-quality, production-ready code with strong testing, monitoring, and documentation.
  • Support the full development lifecycle of services you work on, from design through deployment and iteration.
  • Mentor and support teammates through code reviews and knowledge sharing.
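As a rough illustration of the kind of work described above, the sketch below loads a TorchScript-exported model with LibTorch and runs a single forward pass in C++. This is a minimal, hypothetical example rather than Cognitiv code; the model path ("model.pt"), input shape, and build setup are assumptions.

```cpp
// Minimal illustrative sketch only -- not Cognitiv's actual service code.
// Assumes a TorchScript-exported model saved at the hypothetical path "model.pt".
#include <torch/script.h>
#include <iostream>
#include <vector>

int main() {
    // Load the serialized TorchScript module.
    torch::jit::script::Module module;
    try {
        module = torch::jit::load("model.pt");
    } catch (const c10::Error& e) {
        std::cerr << "Failed to load model: " << e.what() << std::endl;
        return 1;
    }

    // Inference-only path: disable autograd bookkeeping to reduce latency and memory use.
    torch::NoGradGuard no_grad;
    module.eval();

    // Build a dummy input batch (shape is illustrative).
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::ones({1, 128}));

    // Run the forward pass and read back the output tensor.
    at::Tensor output = module.forward(inputs).toTensor();
    std::cout << "Output shape: " << output.sizes() << std::endl;
    return 0;
}
```

In a production service, a loop like this would typically sit behind a request interface with batching, warm-up, and latency/throughput monitoring layered on top.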

Benefits

  • Medical, dental & vision coverage (some plans 100% employer-paid)
  • 12 weeks paid parental leave
  • Unlimited PTO + Work-From-Anywhere August
  • Career development with clear advancement paths
  • Equity for all employees
  • Hybrid work model & daily team lunch
  • Health & wellness stipend + cell phone reimbursement
  • 401(k) with employer match
  • Parking (CA & WA offices) & pre-tax commuter benefits
  • Employee Assistance Program
  • Comprehensive onboarding (Cognitiv University)