Senior AI DevOps Engineer

Roku
Austin, TX
Hybrid

About The Position

Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers with unique capabilities to engage consumers.

From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

About the team

Join a high-performing, innovative team that plays a pivotal role in Roku’s mission to be the best TV streaming platform in the world. Our team is responsible for delivering intuitive, high-quality mobile applications that enhance the way millions of users interact with Roku devices globally. We pride ourselves on creating products that “just work,” seamlessly and effortlessly. This commitment to excellence is driven by a collaborative, inclusive, and results-oriented culture where your contributions will directly impact the user experience of millions. If you’re passionate about building products that feel magical and intuitive, this is the team for you.

About the role

The mobile team ships the Roku Remote, Smart Home, and Howdy apps on iOS and Android. You own the CI/CD pipeline and QA automation infrastructure for the mobile engineering team. In this role, you will use AI to design and build a fully autonomous, self-healing CI/CD and QA automation pipeline for multiple products with millions of users. You treat AI as your primary design tool, not an add-on. Every system you build should minimize human intervention, from code push to app store submission.

You'll own CI Pipeline Architecture: the path from git push to a green or red signal. Your job is to make that path fast, reliable, and cheap. Your ownership also includes QA Automation & Device Orchestration: the software systems that schedule, monitor, and recover the test infrastructure.

Requirements

  • 5+ years of experience operating CI/CD infrastructure at scale, preferably GitLab CI
  • Ability to travel up to 20%
  • Deep understanding of mobile build systems (Xcode/xcodebuild, Gradle) and mobile-specific CI challenges (code signing, provisioning, multi-platform builds)
  • Strong scripting (Python, Bash) and ability to build internal tooling: reservation systems, health monitors, and pipeline analytics dashboards
  • Advanced proficiency with AI-assisted development (Copilot, Claude Code, Cursor, or equivalent). You use AI as your default approach to writing code, building systems, and solving infrastructure problems
  • Experience designing autonomous, self-healing systems that detect, diagnose, and recover from failures without human intervention
  • AI-first problem solving where your instinct is to automate with AI before adding manual process or headcount
  • Obsession with developer experience. You measure your success by how fast and reliably engineers get feedback, not by how complex your infrastructure is
  • Data-driven decision making. You measure failure rates, waste rates, device utilization, and pipeline duration, and you use those numbers to prioritize your work

Nice To Haves

  • Experience with infrastructure-as-code (Terraform, Ansible, or equivalent) for managing cloud and on-premises infrastructure
  • Working knowledge of WiFi and BLE protocols, enough to understand why tests that exercise radio communication behave differently from pure software tests
  • Experience with mobile test automation frameworks (XCUITest, Espresso, Appium), not to write tests but to understand what they need from infrastructure
  • Experience scaling CI for high-volume, automated code generation (agentic engineering, bot-authored MRs)

Responsibilities

  • Design and maintain CI/CD pipelines for iOS and Android on GitLab CI
  • Architect pipeline stages for fail-fast execution: cheapest checks first (lint, compile, static analysis), expensive checks last (device farm tests)
  • Build smart test routing: analyze MR diffs to determine which tests need physical devices and which can run on emulators, so 80% of MRs never touch the device farm
  • Build flaky test detection and quarantine systems. Classify failures as infrastructure-caused vs. code-caused so engineers trust the signal
  • Automate release mechanics: code signing, versioning, TestFlight/Play Console uploads, dSYM and mapping file management. The goal is zero manual steps between merge and app store submission
  • As agent-authored MR volume grows, ensure pipelines absorb the increase without degrading speed or starving human-authored MRs of resources
  • Build the device reservation and orchestration system that assigns devices to CI jobs, prevents contention, and maximizes utilization without manual scheduling
  • Design self-healing automation: health checks detect unresponsive devices, trigger remote recovery via API, and re-register them, with no human intervention required
  • Define the device compatibility matrix: which firmware/model combinations require real hardware and which can run on emulators
  • Implement priority-based test routing: device-touching MRs get farm time, UI-only MRs never queue for a device
  • Use AI to identify failure patterns, predict infrastructure issues, and continuously optimize pipeline performance

Benefits

  • Global access to mental health and financial wellness support and resources
  • Healthcare (medical, dental, and vision)
  • Life, accident, disability, commuter, and retirement options (401(k)/pension)
  • Time off in accordance with local leave policies and other personal needs