Staff Platform Manager - Payments, Evaluations and Automation

Airbnb

11d•Remote

About The Position

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

The Community You Will Join:

At Airbnb, if there’s anything related to money, it comes to the Payments team. We are building a world-class payments and commerce organization - one that currently supports 191 countries, 70 currencies, connects dozens of payment providers and banks, processes multiple billions of dollars, and empowers more people to participate in our global marketplace.

Our Payments platform organization has 2 domains:
Payments and Commerce platform - facilitating money movement and exchange of value while also fueling growth for our business.
Payments Compliance and Payments Risk - ensuring safe and efficient payment processing.

The Staff Product Manager, Evals & Automation, reporting to the Product Lead, Payments Risk, will define the strategy and roadmap for how Airbnb evaluates, learns from, and automates risk decisions, ensuring we protect the marketplace while minimizing friction and insult to good users.

The Difference You Will Make:

This role will own the product vision, strategy, and execution for critical evaluation workflows for risk detection & mitigation. Your mandate is twofold:
Design principled evaluation frameworks that determine the right size and shape of holdouts, control groups, and manual review samples, without degrading model performance or decision quality.
Drive automation that meaningfully reduces manual reviews, operational burden, and customer friction, while preserving the labels and signals required to keep models accurate and resilient over time.
Success in this role is measured by sustained reductions in manual review volume, improved approval rates, stable or improving loss performance, and evaluation systems that scale as risk vectors evolve.

A Typical Day:

A typical day involves reviewing how models, holdouts, and manual reviews interact across the decisioning funnel, examining false positives, label coverage, approval lift, and downstream loss impact. You will work closely with Data Science, Machine Learning, Risk Engineering, and Operations to understand where current evaluation approaches over-sample good users, introduce bias, or create unnecessary friction. You will define the end-to-end product vision for risk evaluations and learning loops, building a multi-year roadmap that balances statistical rigor, operational efficiency, customer experience, and regulatory expectations.

You will own requirements and execution for systems that:
Dynamically size and manage holdouts
Optimize when and how manual reviews are invoked
Preserve high-quality labels without over-reliance on human review
Enable faster, safer iteration on risk models and policies

You will partner deeply with Fraud & Safety, Trust, Legal, Policy, Customer Support, and Payments Operations to align on decision principles and ensure evaluation strategies are understood, trusted, and actionable across the company. You will also collaborate with Payments Operations and Support to reduce manual effort, handle edge cases better, and unlock high-quality decisioning at scale. Key to success for this role is to partner deeply with other Airbnb platform organizations such as Fraud & Safety, AirCover, Legal, Customer Support, Policy Enforcement & Payments Operations teams to align on the vision and mission and coordinate tactics to deliver mutually aligned outcomes.

Requirements

9+ years of Product Management experience, or 6+ years with a BS/MS, with significant work in payments, risk/fraud, chargebacks, or marketplace integrity.
Demonstrated experience working with model-driven systems, evaluation frameworks, experimentation, and decisioning under uncertainty.
Strong intuition for long-tail risk problems, false positives, and the unintended consequences of sampling and labeling strategies.
Systems thinker who can design learning loops that scale globally and remain robust as adversarial behavior evolves.
Highly data-literate: comfortable working with loss curves, precision/recall tradeoffs, operational metrics, and model performance diagnostics.
Systems thinker with the ability to architect modular processes that scale globally.
Ability to make principled, high-judgment decisions in ambiguous, high-stakes environments.
Takes ownership of multiple product areas or one with significant impact, establishing clear expectations, driving communication, quality and timely delivery.
Experience designing end-to-end user experiences that balance risk mitigation with customer empathy.
Adapts communication style according to the audience across teams, extended stakeholders and leadership.
Empathetic, clear communicator who can lead cross-functional partners through influence, not authority.
Autonomously communicates, influences and aligns stakeholders on the product vision and strategy.
With minimal guidance, defines products that sustain the changing needs of the business, industry, users, and partners, working with Eng/Design to translate them into long-term architecture and ecosystem thinking, with a clear understanding of the platform capabilities to be built.
Strong analytical chops: you’re comfortable diving into financial & customer data, operational metrics, and model performance.
Proven ability to partner deeply with engineering, design, operations, data science, and legal/compliance teams.
Comfortable working in a fast-paced environment with evolving regulatory requirements and complex trade-offs.
Authorized to work in the United States.

Responsibilities

Design principled evaluation frameworks that determine the right size and shape of holdouts, control groups, and manual review samples—without degrading model performance or decision quality.
Drive automation that meaningfully reduces manual reviews, operational burden, and customer friction, while preserving the labels and signals required to keep models accurate and resilient over time.
Review how models, holdouts, and manual reviews interact across the decisioning funnel, examining false positives, label coverage, approval lift, and downstream loss impact.
Work closely with Data Science, Machine Learning, Risk Engineering, and Operations to understand where current evaluation approaches over-sample good users, introduce bias, or create unnecessary friction.
Define the end-to-end product vision for risk evaluations and learning loops, building a multi-year roadmap that balances statistical rigor, operational efficiency, customer experience, and regulatory expectations.
Own requirements and execution for systems that: dynamically size and manage holdouts; optimize when and how manual reviews are invoked; preserve high-quality labels without over-reliance on human review; and enable faster, safer iteration on risk models and policies.
Partner deeply with Fraud & Safety, Trust, Legal, Policy, Customer Support, and Payments Operations to align on decision principles and ensure evaluation strategies are understood, trusted, and actionable across the company.
Collaborate with Payments Operations and Support to reduce manual effort, handle edge cases better, and unlock high-quality decisioning at scale.
Partner deeply with other Airbnb platform organizations such as Fraud & Safety, AirCover, Legal, Customer Support, Policy Enforcement & Payments Operations teams to align on the vision and mission and coordinate tactics to deliver mutually aligned outcomes.