Senior Applied AI Engineer (Multimodal Perception & Reasoning)

Volt•San Francisco, CA

29d

About The Position

VOLT is building the next generation of AI perception systems for the physical world, focused on safety, security, and real-time risk detection. We are seeking a Senior Applied AI & Machine Learning Engineer to design, optimize, and ship multimodal AI models that operate reliably in real-world environments. This is a deeply applied role, centered on taking models from data to production, across both edge devices and cloud infrastructure. You will work on vision, video, and language-based models that understand real-world scenes and events, and you will be accountable for their accuracy, latency, robustness, and cost in production systems. This role reports directly to the Head of Engineering and plays a critical role in advancing VOLT AI's core perception platform.

Requirements

8+ years of experience in applied machine learning or AI systems
Strong hands-on experience with vision, video, or multimodal models
Proven experience taking models into production, not just research prototypes
Deep understanding of model optimization (quantization, pruning, performance tuning)
Proficiency in Python and modern ML frameworks (e.g., PyTorch)
Experience evaluating models using real-world metrics and constraints
Ability to operate independently and own complex technical systems end to end

Nice To Haves

Experience with multimodal or vision-language models (CLIP-like, BLIP-like, or custom)
Experience deploying models to edge or resource-constrained environments
Familiarity with inference optimization stacks (ONNX, TensorRT, CUDA)
Experience working on physical-world perception systems (video, sensors, environments)
Background in safety, security, robotics, or autonomous systems
Experience mentoring senior engineers or providing technical leadership

Responsibilities

Build, fine-tune, and deploy production-grade multimodal models for safety and security applications, with a focus on visual and video perception; language-assisted and multimodal reasoning; and temporal understanding of real-world environments
Own the full applied ML lifecycle: data collection, labeling strategies, and dataset curation; model fine-tuning, evaluation, and iteration; and deployment, monitoring, and continuous improvement in production
Drive model performance in real-world conditions, optimizing for high precision and recall; low false-positive and false-negative rates; and robustness to noise, lighting changes, occlusion, and domain shift
Optimize models for edge and cloud deployment through quantization, pruning, and model compression; latency, throughput, and memory optimization; and hardware-aware tuning for GPUs and edge accelerators
Build and maintain training and inference pipelines that support scalable experimentation and evaluation; reproducibility and model versioning; and reliable production deployment
Collaborate closely with infrastructure and systems engineers to integrate models into real-time perception pipelines; balance accuracy, performance, and cost constraints; and diagnose and resolve production inference issues
Use real-world deployment feedback and metrics to drive data and model improvements
