AI Technical Lead

NIO, San Jose, CA
$192,100 - $249,600

About The Position

NIO is a pioneer and a leading company in the premium smart electric vehicle market. Founded in November 2014, NIO’s mission is to shape a joyful lifestyle. NIO aims to build a community starting with smart electric vehicles to share joy and grow together with users. NIO designs, develops, jointly manufactures and sells premium smart electric vehicles, driving innovations in next-generation technologies in autonomous driving, digital technologies, electric powertrains and batteries. NIO differentiates itself through its continuous technological breakthroughs and innovations, such as its industry-leading battery swapping technologies, Battery as a Service, or BaaS, as well as its proprietary autonomous driving technologies and Autonomous Driving as a Service, or ADaaS.

NIO’s product portfolio consists of the ES8, a six-seater smart electric flagship SUV; the ES7 (or the EL7), a mid-large five-seater smart electric SUV; the ES6, a five-seater all-round smart electric SUV; the EC7, a five-seater smart electric flagship coupe SUV; the EC6, a five-seater smart electric coupe SUV; the ET7, a smart electric flagship sedan; and the ET5, a mid-size smart electric sedan.

Roles and Responsibilities

Architect the Hybrid AI Vision: Lead the architectural design and strategic vision for hybrid inference systems, dynamically distributing Large Language Model (LLM) and Vision-Language Model (VLM) workloads across edge computing environments and cloud infrastructure.

Team Leadership & Innovation: Lead, mentor, and inspire a team of specialized engineers working across distributed systems orchestration, inference optimization, and AI compiler engineering. While you are not expected to be a hands-on master of every domain, you will drive the overarching technical roadmap, foster a culture of cutting-edge innovation, and guide domain experts in navigating complex system tradeoffs.

Design Dynamic Orchestration & Resilience: Oversee the architecture of high-availability orchestration engines that intelligently route inference tasks. Guide the team in developing cascading inference mechanisms, dynamic model fallback strategies, and robust telemetry to ensure continuous, steady-state inference under varying connectivity constraints.

Requirements

  • Ph.D. in Computer Science, Computer Engineering, Artificial Intelligence, or a related field with 8+ years of relevant industry experience (or Master’s degree with 12+ years), including proven experience leading technical teams or driving complex architectural roadmaps.
  • Demonstrated capability to lead full-stack AI systems engineering.
  • Deep, hands-on mastery of at least one of the following core domains, coupled with the systemic breadth required to effectively lead engineers working across the others. Distributed Systems & Hybrid Inference: designing, scaling, and deploying production-grade distributed ML systems, and balancing cloud infrastructure with edge constraints using modern routing paradigms such as cascading inference architectures and semantic routing.
  • Proven experience optimizing state-of-the-art LLM/VLM inference pipelines.
  • Deep understanding of model compression (e.g., PTQ, QAT, AWQ, FP8/INT4), hardware-aware compute optimizations (e.g., FlashAttention), and advanced memory management (e.g., PagedAttention, KV cache compression/eviction).
  • C++ and production-grade Python proficiency.
  • Deep understanding of edge/cloud model-serving frameworks (e.g., vLLM, TensorRT-LLM, ExecuTorch, MLC-LLM) and AI compilers (e.g., MLIR, Apache TVM, Triton) for compute graph optimization and custom kernel development.

Nice To Haves

  • Deep understanding of privacy-preserving AI techniques (federated learning, differential privacy, secure enclaves) essential for processing sensitive data across edge and cloud environments.
  • Publications in relevant AI, ML, or systems conferences (e.g., NeurIPS, ICML, MLSys), or active contributions to open-source ML infrastructure projects (e.g., vLLM, ONNX Runtime, Apache TVM, llama.cpp).

Responsibilities

  • Lead the architectural design and strategic vision for hybrid inference systems, dynamically distributing Large Language Model (LLM) and Vision-Language Model (VLM) workloads across edge computing environments and cloud infrastructure.
  • Lead, mentor, and inspire a team of specialized engineers working across distributed systems orchestration, inference optimization, and AI compiler engineering.
  • Drive the overarching technical roadmap, foster a culture of cutting-edge innovation, and guide domain experts in navigating complex system tradeoffs.
  • Oversee the architecture of high-availability orchestration engines that intelligently route inference tasks.
  • Guide the team in developing cascading inference mechanisms, dynamic model fallback strategies, and robust telemetry to ensure continuous, steady-state inference under varying connectivity constraints.

Benefits

  • Anthem Blue Cross, HSA, and Kaiser HMO medical plans with $0 for Employee Only Coverage.
  • Dental (including orthodontic coverage) and vision plans. Both provide options with a $0 paycheck contribution covering you and your eligible dependents.
  • Company Paid HSA (Health Savings Account) Contribution when enrolled in the High Deductible Anthem Blue Cross medical plan
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with Brokerage Link option
  • Company paid Basic Life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and Vacation time
  • 13 Paid Holidays a year
  • Paid Parental Leave for first 8 weeks at full pay (eligible after 90 days of employment with NIO)
  • Paid Disability Leave for first 6 weeks at full pay (eligible after 90 days of employment with NIO)
  • Voluntary Life and AD&D options for you, your spouse/domestic partner and dependent child(ren)
  • Pet insurance
  • Commuter benefits
  • Mobile Cell Phone Credit
  • Free lunch and snacks
  • Onsite gym
  • Employee discounts and perks program