Principal Machine Learning Engineer, Short-form

Paramount•New York, NY

2d•$233,000 - $350,000

About The Position

We are looking for a Principal Machine Learning Engineer to set the technical direction for the Shortform pod. Your mission is to define the long-term architecture and modeling strategy for the Personalization of Short-form experiences, including the "in-view" and "in-carousel" surfaces across our "Gist" and clip ecosystem. You will own the technical vision for how short-form assets convert casual browsers into committed viewers at platform scale.

This is a Principal role, meaning you are the senior-most technical authority on the pod. You set multi-quarter technical strategy, drive cross-pod alignment, and are accountable for the scientific rigor of how we model long-term user satisfaction. You will work within a GCP-based environment, utilizing TensorFlow and PyTorch, with a heavy focus on Post-training RL (Reinforcement Learning) to optimize session-level and long-horizon rewards.

Why This Role Matters

Setting the Architecture: You define the multi-stage ranking and RL architecture that determines which short-form asset is surfaced to every user, directly shaping CTR, discovery velocity, and long-term retention.
Beyond the Click: You establish how we frame and optimize long-horizon reward signals, ensuring short-form content drives durable engagement rather than short-term engagement traps.
Org-Level Quality Bar: You raise the technical bar across the Shortform pod and adjacent pods, anticipate systemic risks (data, modeling, feedback loops), and influence the broader Applied ML roadmap.

Requirements

Minimum: 8+ years of experience in MLE with a track record of setting technical direction for large-scale ranking or recommender systems
Deep expertise in Reinforcement Learning, particularly Post-training RL and long-horizon reward modeling
Proficiency in GCP, TensorFlow, and PyTorch
Demonstrated ability to influence technical strategy across multiple teams

Nice To Haves

Experience in video-first social or streaming apps
Background in multi-modal signal processing
Published work or recognized contributions in ranking, RL, or recommender systems

Responsibilities

Set Technical Strategy: Own the multi-quarter technical roadmap for short-form ranking, candidate generation, and Post-training RL.
Architect End-to-End Systems: Design the multi-stage ranking architecture spanning retrieval, ranking, re-ranking, and RL-based policy optimization.
Advance RL in Production: Drive the use of Post-training RL techniques, including reward modeling, off-policy evaluation, and policy alignment, to improve long-term user satisfaction.
Cross-Pod Influence: Partner with Content Understanding, ML Platform, Core Science, and Product to align short-form personalization with broader Discovery strategy.
Operate at Scale: Ensure ranking pipelines are high-throughput, reliable, and observable in GCP using TensorFlow/PyTorch.
Mentorship & Talent: Mentor IC1–IC3 engineers, set technical standards across the pod, and grow the next generation of senior ML talent.
Mitigate Systemic Risk: Identify and resolve feedback loops, exposure biases, and filter-bubble dynamics in how short-form content is surfaced.