Test Triage & Automation Engineer, Siri

Apple • Cupertino, CA

About The Position

We are seeking a Senior Software Engineer to join our Siri AI Client Platforms Quality Engineering team. In this role, you will work with high volumes of test data and evaluation pipelines, which calls for both technical depth and creative thinking to build solutions that keep pace with the rapid evolution of AI technologies. You will be responsible for designing, driving, triaging, and evaluating automation results that support the qualification of Siri's AI features: not just validating what features are supposed to do, but creatively measuring qualitative experiences the way a real user would. This means going beyond pass/fail metrics and thinking deeply about how Siri's responses feel, how natural the interactions are, and how well the product truly serves our customers in the real world.

You will work closely with Product, Platform engineers, and program managers to understand how Siri's features behave, how they evolve, and how changes in the underlying AI stack ripple through to the customer experience. Your insights and findings will directly influence product decisions, model improvements, and feature launches.

We are a fast-paced and deeply collaborative team that values the tight relationship between Quality engineering, Product engineering, and program management. We move quickly, we hold ourselves to the highest standards, and we genuinely care about the products we build. If you are someone who thrives in an environment where your work matters, where innovation is encouraged, and where you can see the direct impact of your contributions on a global scale, this is the team for you.

Requirements

5+ years of experience designing, implementing, and optimizing large-scale data-driven platforms and frameworks, APIs, services, and tools
Thorough understanding of system architecture and large-scale system design
Strong programming skills in Swift, Python, and shell scripting
Experience building dashboards and analytics solutions using tools like Tableau, Grafana, Superset, or Splunk to visualize KPIs and monitor data quality
Demonstrated success in collaborating cross-functionally with engineering, machine learning, and data science teams to solve complex challenges
Ability to proactively triage, investigate, and debug difficult technical and UX problems independently as well as collaboratively
Capacity to drive test triage products, methodologies, and processes
Proficiency with software revision control (e.g. Git) and CI/CD systems (e.g. Jenkins)
Highly organized with strong planning skills to estimate, update, and communicate progress
BS/MS in Computer Science, Engineering, or a related field

Nice To Haves

Deep understanding of large-scale data validation platforms with a focus on privacy
Experience building tooling solutions with Claude tools
Knowledge of statistics-based evaluation approaches, ML training pipelines, and techniques for enhancing the accuracy of ML systems
Strong attention to detail and the proven ability to delve into data, uncover hidden patterns, and conduct comprehensive error/deviation analysis

Responsibilities

Design, drive, triage, and evaluate automation results that support the qualification of Siri's AI features
Measure qualitative experiences the way a real user would
Think deeply about how Siri's responses feel, how natural the interactions are, and how well the product truly serves our customers in the real world
Work closely with Product, Platform engineers, and program managers to understand how Siri's features behave, how they evolve, and how changes in the underlying AI stack ripple through to the customer experience
Deliver insights and findings that directly influence product decisions, model improvements, and feature launches
