Senior Machine Learning Engineer, Content Engineering

Paramount • New York, NY

About The Position

We are seeking a Senior Machine Learning Engineer to lead the development of the multimodal embedding and retrieval systems that power content discovery across Paramount's video library. In this role, you will own the full lifecycle of multimodal embedding systems optimized for text and video understanding, from generation through ingestion and indexing to retrieval, directly impacting how millions of users discover and engage with short-form clips. You will partner with product leadership and the Content and Personalization engineering teams, mentor engineers, and serve as a senior technical voice shaping how the platform "sees" and retrieves video clip content at scale.

Requirements

5–8+ years of experience in machine learning engineering, with a focus on production ML systems
Expertise in multimodal ML, including experience with video, image, and/or audio embedding models
Deep knowledge of vector embedding generation, storage, and retrieval, with a preference for hands-on Qdrant experience (FAISS, Pinecone, pgvector, AlloyDB, or similar also considered)
Strong Python proficiency; Java is a plus
Demonstrated experience building and operating data pipelines at scale, including batch and streaming ingestion workflows
Solid understanding of hybrid retrieval systems: vector search, lexical search, and reranking
Proven ability to communicate technical concepts clearly and partner effectively with product and engineering teams
Track record of mentoring engineers and leading technical decisions in a team setting

Nice To Haves

Experience with agentic systems and multi-agent orchestration
Knowledge of diversity and relevance algorithms, such as Maximal Marginal Relevance (MMR), applied in the reranking phase
Background in video codecs, FFmpeg, or low-level video processing pipelines
Familiarity with retrieval-augmented generation (RAG) systems

Responsibilities

Design and build embedding pipelines for video content metadata and clip-level representation
Design collection and vector schemas to shape data structure, indexing behavior, and retrieval performance under scale and modality complexity
Lead the transition from traditional feature engineering to a vector-centric "context-first" architecture, built on compositional queries and on high-dimensional hyper-vector representations that you design to unify visual, textual, and behavioral signals
Design offline/online evaluation frameworks (e.g., nDCG, MRR, Recall@K) specifically for multimodal alignment, ensuring content embeddings match search intent
Build hybrid retrieval systems that combine vector similarity search with lexical search and reranking layers to deliver fast, accurate results at production scale
Engineer the retrieval layer to capture nuanced user-content relationships that model training alone cannot surface, combining multimodal embeddings to improve recommendation depth at scale
Implement query-time optimizations including caching, filtering, and index sharding strategies
Tune vector quantization strategies (PQ, SQ, Binary Quantization) to reduce memory footprint and improve search throughput without compromising retrieval precision
Own performance SLAs and monitor retrieval systems for latency, throughput, recall, and cost efficiency
Build and maintain scalable batch and streaming pipelines, with logging, metrics, and alerting to surface anomalies and maintain observability
Process content at scale using distributed frameworks such as Spark or Ray
Architect and build scalable integration layers on top of vector databases, exposing robust APIs and services for similarity search, hybrid retrieval, and metadata filtering
Own model versioning and embedding migration strategies, building compatibility tooling that prevents embedding drift from degrading retrieval quality across model upgrades
Collaborate with backend and platform teams to ensure interoperability with upstream data pipelines and integration with downstream personalization and discovery surfaces
Communicate technical system behavior, tradeoffs, and recommendations clearly to both technical and non-technical stakeholders
Mentor direct reports, providing technical guidance in multimodal ML, vector retrieval, and production systems design
Take ownership of project outcomes from scoping through delivery in a dynamic environment, proactively identifying and mitigating risks across video processing, metadata, and indexing workflows