About The Position

Join the team that owns the vision to enable high-quality, fast, and seamless software and model delivery for Apple's Search and Recommendation products. Serving billions of queries, our self-service intelligent platform is the backbone of engineering velocity. You will drive a mission-critical function, ensuring a fast, reliable, and secure deployment journey for over 1,000 developers.

Requirements

  • 8+ years of software engineering, including 4+ years focused on developer platforms, CI/CD, release engineering, and/or production infrastructure.
  • Deep hands-on ownership of large-scale CI/CD systems and build/test infrastructure (microservices, modularized stack) with a track record of designing, not just maintaining, these systems.
  • Strong coding in at least one systems/backend language (e.g., Python, Go), plus scripting for automation and tooling.
  • Expertise in containers and orchestration (Docker + Kubernetes or equivalent) and in running build/test/release workloads on cloud platforms (AWS, GCP) or large clusters.
  • Track record of championing cross-team collaboration with Developers, Infra, Quality, Performance, and SRE groups.
  • Experience leading ‘Incident Response Rooms’ to resolve time-sensitive release issues swiftly and effectively.
  • Proven experience running robust release processes: branching strategies, quality gates, feature flag management, staged rollouts, rollbacks, and gathering insights through post-release analysis.
  • Defined and owned SLOs for developer workflows across the deployment journey: duration, success rate, flaky-test rate, deployment mean time to resolution (MTTR), and release volume.
  • Built robust monitoring and alerting for release platforms; led root cause analyses (RCAs) and implemented sustainable fixes.
  • Demonstrated data-driven decision-making: used metrics and experiments to prioritize work and prove impact (e.g., "reduce the average time-to-green-build by 40%").

Nice To Haves

  • AI-driven developer productivity: hands-on experience integrating AI/LLM capabilities into developer workflows, such as code review assistance, intelligent test selection, triage bots, and incident summarization.
  • Experience using telemetry and historical data from dev tools to drive recommendations or automation (e.g., failure analysis to accurately identify root causes and propose resolutions).
  • Experience with model deployments is a plus.

Responsibilities

  • Own and evolve the end-to-end Developer Platform for CI/CD and releases, ensuring a fast, reliable, and secure deployment journey for software and models.
  • Own the technical strategy and roadmap for the large-scale Developer Platform (CI/CD and release systems), aligning with business-level priorities.
  • Lead the design and rollout of platform changes that affect 1000+ developers, from design through adoption.
  • Identify and drive impactful KPIs (e.g., 2× faster feedback cycles, 50–70% reduction in flaky test noise, automated recovery).
  • Partner with engineering teams to onboard new services and models onto the platform, providing self-guided and self-service tooling.