Senior Machine Learning Operations Engineer

Mercury•Portland, OR

$166,600 - $208,300

About The Position

Mercury's use of machine learning in risk decisioning is growing fast in both scope and stakes. Models increasingly drive real-time decisions about fraud and financial crime, and the Machine Learning Platform (MLP) team exists to build a paved path from a trained model to a reliable production deployment, speed up iteration, and ensure granular production observability. MLP owns the production ML lifecycle: the systems that take a model from registry through deployment, real-time inference, observability, and retraining. Our Data Science colleagues author and train the models; we build the platform that lets them register, deploy, and observe those models in production without carrying the operational burden themselves, and we serve low-latency, highly available scores to the decision engine that depends on them. The platform supports business decisioning broadly, with our first use cases focused on fraud risk outcomes.

At Mercury, we are committed to crafting an exceptional banking experience for startups. Our team is passionately focused on ensuring our products create a safe environment that meets the needs of our customers, administrators, and regulators.

Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC.

Requirements

5+ years in machine learning engineering, backend software engineering, MLOps, or a closely related field
Production ML service experience — deploying, serving, and operating models in low-latency, high-availability contexts
Strong backend engineering fundamentals in Python, with API frameworks like FastAPI or Flask
Experience with model deployment and lifecycle tooling: model registries, CI/CD for models, versioning, and staged rollout patterns (shadow, canary, champion/challenger)
Experience building observability and alerting for production services — latency, errors, and ideally model-specific signals like drift
Comfort with the data layer ML depends on: SQL, key-value/low-latency stores (Redis, DynamoDB, or equivalent), and streaming pipelines (Kafka, Kinesis, Redpanda, or equivalent)

Nice To Haves

Familiarity with a modern data stack (Snowflake, dbt, Dagster, Airflow, or similar)
Experience operating in a regulated, audit-sensitive, or compliance-adjacent environment
Exposure to functional languages or willingness to work across a stack that includes Haskell, React, and TypeScript

Responsibilities

Build and operate the real-time inference service that serves model scores to the risk decision engine, with low latency and high availability as first-class requirements
Own model deployment infrastructure: registry and versioning; CI/CD with performance, bias, and consistency checks; shadow mode; and staged rollouts
Build model observability: availability, latency, and error monitoring, plus drift detection as a retraining trigger
Partner with Risk Data Science to take models from a clean development-to-production handoff through to production operation under MLP ownership
Implement experimentation capabilities such as champion/challenger and canary routing, and explainability outputs like SHAP attributions
Feel a strong sense of product ownership and actively seek responsibility — we self-organize on small and medium projects, and we want someone excited to help shape and build a brand-new platform team